Abstract
A workshop was conducted June 28-30, 1994 at the USGS National
Center in Reston, VA by representatives of the MRLC
(Multi-Resolution Land Characteristics) consortium for the
purpose of learning and evaluating SPECTRUM image analysis
software relative to joint goals of consortium programs. The
software is reasonably user-friendly, and permits satellite image
data (notably Thematic Mapper) to be approached in an
interpretive mode for land-use/land-cover mapping without the
necessity of painstaking feature delineations. Suggestions were
developed for mapping strategy, a few inconveniences were noted,
and recommendations were made for possible future enhancements.
Introduction
SPECTRUM implements an unsupervised classification approach to
multi-spectral image data. Unsupervised classification involves
first "clustering" the image data to capture the major
image information and then assigning clusters to categories of
interest for mapping. The SPECTRUM version of the unsupervised
approach was developed by Patrick M. Kelly and James M. White in
the Los Alamos National Laboratory, Computer Research Group. The
original context of development was defense intelligence. The
clustering mechanism uses a nearest-neighbor algorithm giving
results similar to the k-means program in the SAS statistical
package, but utilizes several innovative strategies to improve
speed and accommodate large data sets. A simple user's
perspective for MRLC is that SPECTRUM provides a
computer-assisted mode of "photointerpreting" satellite
image data that is rapid, highly interactive, and does not
require extensive prior experience in remote sensing. As is
typical of more conventional photointerpretation, however, the
quality of the final map improves with the analyst's knowledge of
the landscape being mapped and with the amount of ancillary
information available.
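The clustering idea described above can be illustrated with a minimal k-means sketch. This is not the SPECTRUM algorithm, whose speed-up strategies are not described here; it is a plain nearest-mean loop written under the assumption that each pixel is a tuple of band values and that starting means ("seeds") are supplied by the caller. The function names and sample values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two band vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(pixel, means):
    """Index of the cluster mean closest to the pixel."""
    return min(range(len(means)), key=lambda k: squared_distance(pixel, means[k]))


def kmeans(pixels, initial_means, iterations=10):
    """Plain k-means: assign pixels to nearest mean, then recompute means."""
    means = [tuple(float(v) for v in m) for m in initial_means]
    for _ in range(iterations):
        labels = [nearest(p, means) for p in pixels]
        for k in range(len(means)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:  # an empty cluster keeps its previous mean
                means[k] = tuple(sum(band) / len(members) for band in zip(*members))
    # Final labels are recomputed so they agree with the final means.
    labels = [nearest(p, means) for p in pixels]
    return labels, means


# Hypothetical two-band pixels: three dark, three bright.
pixels = [(10, 12), (11, 10), (12, 11), (200, 190), (198, 205), (202, 199)]
labels, means = kmeans(pixels, initial_means=[(10, 12), (200, 190)])
```

With these sample values the dark pixels fall into cluster 0 and the bright pixels into cluster 1.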
A particular advantage of the system relative to clustering is
that many more clusters are generated than typical for other
versions of unsupervised analysis, thus capturing more of the
scene information. This multiplicity of clusters is called
"hyper-clustering," and enables reasonable reproduction
of the scene from just the cluster information alone. Therefore,
hyper-clustering also constitutes a method of image data
compression. Another substantial advantage for MRLC users is that
EROS Data Center will precluster the scene and provide this
information in the manner of an additional image band. Thus, MRLC
users need not be bothered with the clustering phase at all and
can get right to the business of assigning clusters to desired
map categories with the SPECTRUM software.
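The compression aspect of hyper-clustering can be sketched as follows, assuming the cluster image is a grid of cluster numbers and the cluster information is a table of band means per cluster. The layout, the function names, and the cluster count used in the size comparison are illustrative assumptions, not the format that EROS Data Center delivers.

```python
def reconstruct(cluster_image, cluster_means):
    """Replace each cluster number with that cluster's vector of band means."""
    return [[cluster_means[c] for c in row] for row in cluster_image]


def stored_values(rows, cols, bands, clusters):
    """Count values kept for the full image versus the clustered form."""
    full = rows * cols * bands                  # every band of every pixel
    clustered = rows * cols + clusters * bands  # one number per pixel plus the means table
    return full, clustered


# Hypothetical 2 x 3 cluster image with three clusters and three bands.
cluster_image = [[0, 0, 1], [2, 1, 1]]
cluster_means = {0: (20.0, 35.0, 90.0), 1: (60.0, 55.0, 40.0), 2: (15.0, 12.0, 8.0)}
approximation = reconstruct(cluster_image, cluster_means)

# For a 1000 x 1000 six-band scene and an assumed 240 clusters:
full, clustered = stored_values(1000, 1000, 6, 240)
```

The more clusters there are, the closer the approximation comes to the original scene, at the cost of a larger means table.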
Mapping Scenario
One begins by loading the cluster image and associated cluster
information into memory of a UNIX workstation computer. The next
order of business is to select three "image bands" for
display on the screen. In fact, the resulting display is an
approximation of the original image as rendered through the
spectral band means for the several clusters. Analysts with
photointerpretation experience will probably choose either a band
combination that gives a "color-infrared" view or a
"conventional color" view. Each has advantages for
interpreting particular types of landscape features. Various
"indexes" such as greenness, brightness, wetness, and
so on can also be displayed if the analyst is familiar with their
formulation as ratios or linear combinations of spectral bands.
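A minimal sketch of view selection and index calculation from cluster means follows. It assumes Thematic Mapper band numbering, in which band 4 is near infrared and bands 3, 2, and 1 are red, green, and blue. The data layout and function names are hypothetical, and the normalized difference shown is only one common example of a ratio-type index, not a formula taken from SPECTRUM.

```python
# Band triples assigned to the red, green, and blue display planes.
COLOR_INFRARED = (4, 3, 2)
CONVENTIONAL_COLOR = (3, 2, 1)


def render_view(cluster_image, cluster_means, bands):
    """Build a display grid from the means of the three chosen bands."""
    return [[tuple(cluster_means[c][b] for b in bands) for c in row]
            for row in cluster_image]


def normalized_difference(means, band_a, band_b):
    """Ratio-type index: (a - b) / (a + b) for one cluster's band means."""
    total = means[band_a] + means[band_b]
    return (means[band_a] - means[band_b]) / total if total else 0.0


# Hypothetical means keyed by cluster number, then by TM band number.
cluster_means = {
    0: {1: 20, 2: 30, 3: 25, 4: 100},  # vegetation-like: high near infrared
    1: {1: 40, 2: 35, 3: 30, 4: 10},   # water-like: low near infrared
}
cluster_image = [[0, 1], [1, 0]]

infrared_view = render_view(cluster_image, cluster_means, COLOR_INFRARED)
greenness_like = {c: normalized_difference(m, 4, 3) for c, m in cluster_means.items()}
```

Because the index is computed once per cluster rather than once per pixel, switching the display to an index is inexpensive.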
The desired map legend is next entered as a set of category
labels for landscape features of interest (e.g., land-cover
classes). Along with specifying a category label, one chooses a
color to appear on the screen for "pixels" which will
be placed in that category. The actual process of assigning
clusters to map categories then begins. A "zoom" window
is opened, and a representative sector of the image is moved into
the zoom window with the mouse-driven cursor. As the cursor is
moved around in the zoom window, the number of the cluster in
that pixel location is displayed. One chooses a pixel location
for which the map category is known from ancillary information,
"ground truth," or general "lay of the land"
as seen in the image display. Double clicking the location brings
up a window for assigning the particular cluster number to a map
category. All other pixels belonging to the same cluster then
appear in the designated category color throughout the rest of the
image. Clusters can be transferred from one map category to
another if desired. For those with digital image analysis
experience, this latter process is very much like "training
set" selection in supervised analysis.
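The assignment step can be pictured as a lookup from cluster number to category and from category to display color. The sketch below is an illustration under that assumption; the class and its methods are hypothetical and do not describe SPECTRUM internals.

```python
class Legend:
    """Map legend: category labels with colors, plus cluster assignments."""

    def __init__(self, unassigned_color=(0, 0, 0)):
        self.colors = {}        # category label -> display color
        self.category_of = {}   # cluster number -> category label
        self.unassigned_color = unassigned_color

    def add_category(self, label, color):
        self.colors[label] = color

    def assign(self, cluster, label):
        """Place a cluster in a category; reassigning moves it."""
        if label not in self.colors:
            raise KeyError(f"unknown category: {label}")
        self.category_of[cluster] = label

    def render(self, cluster_image):
        """Color every pixel according to the category of its cluster."""
        def color(cluster):
            label = self.category_of.get(cluster)
            return self.colors[label] if label is not None else self.unassigned_color
        return [[color(c) for c in row] for row in cluster_image]


legend = Legend()
legend.add_category("water", (0, 0, 255))
legend.add_category("forest", (0, 128, 0))

cluster_image = [[7, 7, 12], [12, 3, 7]]
legend.assign(7, "forest")   # one assignment recolors every pixel of cluster 7
legend.assign(12, "forest")
legend.assign(12, "water")   # transfer cluster 12 to another category
category_map = legend.render(cluster_image)
```

Pixels of cluster 3 remain in the unassigned color until that cluster is given a category.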
If one is interested only in a very general categorization
(perhaps water, forest, agriculture, and other), the assignment
can probably be accomplished without recourse to ancillary
information according to the appearance of the landscape in the
image. If one is interested in a more detailed categorization
(perhaps vegetation community types), it becomes necessary to
adopt the traditional photointerpreter's approach to convergence
of evidence using ancillary information (topo maps, soils maps,
airphotos, etc.). This involves a special "highlight"
category in which each cluster is temporarily placed by itself so
that the distribution of its member pixels over the landscape can
be viewed readily. The cluster can then be examined in terms of
elevation, aspect, soils, and so on, in order to determine its
characteristics relative to criteria for map categories. Although
more time-consuming, it may be appropriate to run a text editor
as a separate process in a window so that the characterization
for each cluster can be documented in the course of
interpretation. A bit of counsel based on photointerpretation
experience is that careful assignment is generally more than
repaid by avoidance of frustration in correcting errors later.
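The convergence-of-evidence step can be sketched as isolating one cluster and summarizing an ancillary layer beneath its pixels. The example assumes that the ancillary layer (elevation here) is a grid registered to the cluster image; the function names and values are hypothetical.

```python
def highlight(cluster_image, cluster):
    """Boolean grid marking the pixels that belong to one cluster."""
    return [[c == cluster for c in row] for row in cluster_image]


def summarize_layer(cluster_image, layer, cluster):
    """Summarize an ancillary layer over the pixels of one cluster."""
    values = [v
              for cluster_row, layer_row in zip(cluster_image, layer)
              for c, v in zip(cluster_row, layer_row)
              if c == cluster]
    if not values:
        return None
    return {"count": len(values), "min": min(values), "max": max(values),
            "mean": sum(values) / len(values)}


cluster_image = [[5, 5, 9], [9, 5, 9]]
elevation_m = [[300, 320, 710], [690, 310, 700]]  # hypothetical elevations in meters

mask = highlight(cluster_image, 9)
summary = summarize_layer(cluster_image, elevation_m, 9)
```

A summary of this kind could be pasted into the text-editor notes kept for each cluster.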
We would advise that you carry a typical quarter-scene (TM)
through the entire process, including verification, before
proceeding with the rest of your imagery. This will alert you to
the likely pitfalls for the remainder of work, give you a good
sense of expected accuracy, and perhaps reveal some category
confusion that simply cannot be resolved in this particular mode
of mapping. In the latter case, you should plan on refining your
draft map by subsequent exploitation of other sources of
information.
Multi-Temporal Mapping
Phenology is very important in separating land-use/land-cover
and vegetation classes on the basis of spectral information. The
scene with which we experimented in the workshop was clustered as
a composite of two images, one from early summer (June) and the
other from fall (late in October). This is a particularly
advantageous combination relative to phenology, and the composite
clustering is much better than having the same two scenes
clustered separately.
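One way to picture a two-date composite is as a single stacked band vector per pixel, so that the clustering sees both dates at once. The sketch below assumes two co-registered images stored as grids of band tuples; the function name and the band values are hypothetical.

```python
def composite(early, late):
    """Stack the band values of two co-registered dates, pixel by pixel."""
    return [[tuple(e) + tuple(l) for e, l in zip(early_row, late_row)]
            for early_row, late_row in zip(early, late)]


# Two pixels that look alike in June but differ in October
# (for example, hardwood versus conifer); values are illustrative.
june = [[(30, 90), (30, 90)]]
october = [[(60, 50), (28, 85)]]

stacked = composite(june, october)
```

The two pixels are identical in the June bands alone, yet their stacked vectors differ, which is the separation that the composite makes available to the clustering.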
The composite gives rise to a large number of clusters, several
of which are likely to represent the same map category. It is
much easier, however, to assign several clusters to the same map
category than to face the prospect of lack of separability
between categories. A given forest type may be in different
stages of fall color change as a result of elevation differences,
giving several clusters for the same category. However, such
changes also permit detecting conifers in mixture with hardwoods
and induce crop differences associated with senescence or
harvest. More ancillary information may be needed to account for
phenological distinctions between clusters, but the distinctions
at least become possible. Dual dates also allow working under
clouds as long as the clouds do not coincide in both images.
Working with a multi-date composite will require the interpreter
to alternate views of the image. It will be necessary to switch
back and forth between early-season infrared and late-season
infrared, perhaps along with conventional color for one or both
dates. Multiple dates also increase the importance of learning
expected spectral signatures, which are levels of differing
reflectance between bands and dates for particular types of
features. SPECTRUM makes available a signature profile (plot of
band means) when an instance of a cluster number is pending
category assignment.
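A signature profile of the kind mentioned above can be approximated in text form. This sketch assumes 8-bit band means (0 to 255) and hypothetical band labels for a two-date composite; it is an illustration only and does not reproduce the SPECTRUM display.

```python
def signature_profile(means, labels, width=40, max_value=255):
    """Return a text bar chart of band means, one line per band."""
    lines = []
    for label, value in zip(labels, means):
        bar = "#" * round(width * value / max_value)
        lines.append(f"{label:<10}{value:>6.1f} {bar}")
    return "\n".join(lines)


# Hypothetical means for one cluster: near infrared and red on two dates.
labels = ["Jun TM4", "Jun TM3", "Oct TM4", "Oct TM3"]
means = [255, 51, 102, 0]
profile = signature_profile(means, labels)
print(profile)
```

Comparing the bars between dates shows at a glance how the cluster changes from early to late season.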
Multi-date composites will complicate the prospect of
preclustering at the EROS Data Center. EROS may find it
logistically impractical to precluster in different combinations
of years and dates. This will serve as motivation for user sites
to undertake their own clustering.
Provision for Refinement
It would be unrealistic to expect that the foregoing SPECTRUM
scenarios will adequately address all map categories for all
thematic contexts. Thus, it is only prudent to anticipate
possible need for further refinement after you have done your
best in SPECTRUM. SPECTRUM itself does not currently embody
substantial capabilities for on-screen map editing outside the
cluster environment. There are several paths by which the results
of SPECTRUM work can be carried into other software systems that
are better geared to editing operations. Unfortunately, the
transport utilities are also not currently part of SPECTRUM per
se. You are referred to remote sensing personnel at EROS Data
Center for determining the most expedient import/export
capability relative to your favorite GIS.
Making SPECTRUM More Commodious for Interpreters
SPECTRUM developers have apparently done little in the way of
multi-temporal interpretation themselves, else they would have
made it unnecessary to keep repeating some of the interpretive
operations. The most obvious instance involves switching of image
views. It is presently necessary to associate a spectral band
with each color plane of the computer display each time you want
a different view. When you have once set up a view in this
manner, it should be possible to "save" the view under
some name so that it can be reselected easily when it is needed
again. We strongly urge that such a capability be added to
SPECTRUM in its next version.
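The requested capability amounts to a small table of named views. The sketch below shows what we have in mind; the class, its method names, and the way bands are identified (date, band number) are hypothetical and do not describe any existing SPECTRUM feature.

```python
class ViewLibrary:
    """Named assignments of image bands to the red, green, and blue planes."""

    def __init__(self):
        self._views = {}

    def save(self, name, red, green, blue):
        """Store the band chosen for each color plane under a name."""
        self._views[name] = (red, green, blue)

    def select(self, name):
        """Return the (red, green, blue) band assignment for a saved view."""
        return self._views[name]

    def names(self):
        return sorted(self._views)


views = ViewLibrary()
# Bands are identified here as (date, TM band number), an assumed convention.
views.save("early infrared", ("june", 4), ("june", 3), ("june", 2))
views.save("late infrared", ("october", 4), ("october", 3), ("october", 2))
views.save("early color", ("june", 3), ("june", 2), ("june", 1))

current = views.select("late infrared")
```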
Equally annoying is the need to specify a numeric level of color
for each plane in assigning a color to a category. Susan Benjamin
currently has a sheet of paper that associates color levels with
color names. We wholeheartedly encourage the incorporation of
name-based color selection as an option in SPECTRUM. However, the
capability to specify colors by numeric level should also be
retained.
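Name-based selection with numeric levels retained could look like the following sketch. It assumes three color planes with levels from 0 to 255; the table of names and the function are hypothetical.

```python
# Assumed name table; levels are (red, green, blue) on a 0-255 scale.
NAMED_COLORS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "cyan": (0, 255, 255),
    "magenta": (255, 0, 255),
}


def resolve_color(spec):
    """Accept either a color name or three numeric levels."""
    if isinstance(spec, str):
        key = spec.strip().lower()
        if key not in NAMED_COLORS:
            raise ValueError(f"unknown color name: {spec}")
        return NAMED_COLORS[key]
    levels = tuple(int(v) for v in spec)
    if len(levels) != 3 or any(not 0 <= v <= 255 for v in levels):
        raise ValueError(f"expected three levels from 0 to 255: {spec}")
    return levels


water_color = resolve_color("Blue")
forest_color = resolve_color((0, 128, 0))
```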
We also view as practical necessity the ability to "quick
save" and retrieve the status of category assignments along
with cluster means by cluster and band number to/from an ASCII
file. This would not only allow interruption/resumption of
work sessions and going back to prior stages, but also local
programming of bridgework to statistical packages.
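A "quick save" of this kind could be as simple as a tab-delimited ASCII file. The record layout below ("A" lines for assignments, "M" lines for cluster means by cluster and band number) is our own assumption for illustration and is not an existing SPECTRUM format; the function names are likewise hypothetical.

```python
def to_text(assignments, means):
    """Serialize category assignments and cluster means as ASCII lines."""
    lines = []
    for cluster in sorted(assignments):
        lines.append(f"A\t{cluster}\t{assignments[cluster]}")
    for cluster in sorted(means):
        for band, value in enumerate(means[cluster], start=1):
            lines.append(f"M\t{cluster}\t{band}\t{value!r}")
    return "\n".join(lines) + "\n"


def from_text(text):
    """Rebuild assignments and means; band lines must appear in band order."""
    assignments, means = {}, {}
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "A":
            assignments[int(fields[1])] = fields[2]
        elif fields[0] == "M":
            means.setdefault(int(fields[1]), []).append(float(fields[3]))
    return assignments, means


def quick_save(path, assignments, means):
    with open(path, "w", encoding="ascii") as handle:
        handle.write(to_text(assignments, means))


def quick_load(path):
    with open(path, encoding="ascii") as handle:
        return from_text(handle.read())


assignments = {7: "forest", 12: "water"}
means = {7: [30.5, 80.25], 12: [12.0, 4.0]}
text = to_text(assignments, means)
restored_assignments, restored_means = from_text(text)
```

A flat file of this sort can also be read directly by most statistical packages, which is the bridgework we have in mind.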
Procurement and Platforms
SPECTRUM was developed to run in the Khoros software environment
on UNIX workstation computers. It is possible to obtain Khoros
with SPECTRUM by anonymous FTP through the Internet. If interest
lies solely in SPECTRUM, however, one should seek a stand-alone
version from EROS Data Center.
It must also be noted that not all UNIX workstations are created
equal relative to SPECTRUM. SPECTRUM saw its first intensive use
on Data General platforms at the workshop. While individually and
collectively instructive, the workshop was not thematically
productive due to frequent lock-up of the DGs during SPECTRUM
sessions. Such problems have not occurred on Sun workstations.
Version 2.0 of SPECTRUM is due for release in September and will
have been tested on DGs.
Wish List for Sophisticated Analysts
We would like to:
a) Have the current cluster enter the scatter plot last so that
color/position is not obscured by plotting of other clusters;
b) Have optional scatter plots on principal component axes;
c) Examine the spectral heterogeneity of individual clusters
(standard deviations to go with means);
d) Retain the seed for a cluster and examine its relation to the
ultimate cluster mean;
e) Examine the spectral heterogeneity of clusters assigned to a
thematic category;
f) Explore the prospective addition of clusters to a thematic
class on the basis of spectral similarity;
g) Create supercategories of categories for spectral comparison;
h) Explore the intercluster spectral structure through
higher-dimensional displays and/or collapsing dendrogram;
i) Create spatial partitions of a spectral cluster for separate
labeling by polygonal enclosure with cursor;
j) Have capability for explicit seeding of clusters, including
cluster means from other scenes that may not actually exist as a
pixel in the present scene;
k) Restrict Monte Carlo sampling with an exclusionary binary
mask, i.e. cluster for multiple strata;
l) Display multiple spectral reflectance curves, e.g., display
curves for deciduous forest types to compare 'characteristic'
spectral signatures;
m) Save a library of spectral reflectance curves;
n) Build a menu of 'standard' indices or formulas, e.g.,
greenness, wetness, brightness, etc. so the user doesn't have to
type them in.
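Two of the wishes above, items (c) and (k), are simple enough to sketch. The code assumes that pixels are tuples of band values with a parallel list of cluster labels, and that the exclusionary mask is a parallel list of true/false flags. The function names are hypothetical, and the sampling shown is ordinary random sampling without replacement, not the sampling scheme used inside SPECTRUM.

```python
import random
import statistics


def cluster_spread(pixels, labels, cluster):
    """Per-band standard deviation of the pixels in one cluster (item c)."""
    members = [p for p, lab in zip(pixels, labels) if lab == cluster]
    return [statistics.pstdev(band) for band in zip(*members)]


def sample_outside_mask(pixels, excluded, n, seed=0):
    """Random sample restricted to pixels not flagged by the mask (item k)."""
    allowed = [p for p, flag in zip(pixels, excluded) if not flag]
    return random.Random(seed).sample(allowed, min(n, len(allowed)))


pixels = [(10, 20), (14, 20), (200, 90), (204, 98)]
labels = [0, 0, 1, 1]
excluded = [False, False, True, False]  # e.g., the third pixel lies in another stratum

spread = cluster_spread(pixels, labels, 0)
sample = sample_outside_mask(pixels, excluded, n=2)
```

Running the sampler once per stratum, each with its own mask, would give the clustering for multiple strata mentioned in item (k).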
Workshop Participants:
Wayne Myers, Penn State University
Gail Thelin, USGS-WRD NAWQA
Susan Benjamin, NMD NASA-AMES Research Center
Ann Raspberry, Maryland, DNR
Joy Hood, EROS Data Center
Paul Etzler, EMSL, Las Vegas, NV
Jim Majure, Iowa State University
John Brakebill, USGS-WRD Potomac NAWQA
Pat Green, EPA-EMAP Forest, RTP, NC
John Findley, USGS-NMD, Reston, VA