A workshop was conducted June 28-30, 1994, at the USGS National Center in Reston, VA, by representatives of the MRLC (Multi-Resolution Land Characteristics) consortium for the purpose of learning and evaluating SPECTRUM image analysis software relative to joint goals of consortium programs. The software is reasonably user-friendly, and permits satellite image data (notably Thematic Mapper) to be approached in an interpretive mode for land-use/land-cover mapping without the necessity of painstaking feature delineations. Suggestions were developed for mapping strategy, a few inconveniences were noted, and recommendations were made for possible future enhancements.
SPECTRUM implements an unsupervised classification approach to multi-spectral image data. Unsupervised classification involves first "clustering" the image data to capture the major image information and then assigning clusters to categories of interest for mapping. The SPECTRUM version of the unsupervised approach was developed by Patrick M. Kelly and James M. White in the Computer Research Group at Los Alamos National Laboratory. The original context of development was defense intelligence. The clustering mechanism uses a nearest-neighbor algorithm giving results similar to the k-means program in the SAS statistical package, but utilizes several innovative strategies to improve speed and accommodate large data sets. A simple user's perspective for MRLC is that SPECTRUM provides a computer-assisted mode of "photointerpreting" satellite image data that is rapid, highly interactive, and does not require extensive prior experience in remote sensing. As is typical of more conventional photointerpretation, however, the quality of the final map improves with the analyst's knowledge of the landscape being mapped and with the amount of ancillary information available.
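To make the clustering idea concrete, the following is a minimal sketch of ordinary k-means in plain Python. It is not SPECTRUM's algorithm, whose speed-oriented strategies are not described here; the function names, the explicit seed list, and the sample pixel values are all assumptions made for illustration.

```python
def nearest_cluster(pixel, means):
    """Index of the cluster mean closest to the pixel (squared Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for k, mean in enumerate(means):
        dist = sum((p - m) ** 2 for p, m in zip(pixel, mean))
        if dist < best_dist:
            best, best_dist = k, dist
    return best

def kmeans(pixels, seeds, n_iter=10):
    """Plain k-means: assign pixels to the nearest mean, then recompute the means."""
    means = [list(s) for s in seeds]
    for _ in range(n_iter):
        labels = [nearest_cluster(p, means) for p in pixels]
        for k in range(len(means)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:  # leave the mean of an empty cluster unchanged
                means[k] = [sum(band) / len(members) for band in zip(*members)]
    labels = [nearest_cluster(p, means) for p in pixels]
    return means, labels

# Hypothetical two-band pixel values forming two obvious groups.
pixels = [(10, 12), (11, 13), (12, 11), (200, 210), (205, 208), (198, 215)]
means, labels = kmeans(pixels, seeds=[pixels[0], pixels[3]])
```

Each pixel ends up labeled with the number of its nearest cluster mean, which is the information an analyst later assigns to map categories.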
A particular advantage of the system relative to clustering is that many more clusters are generated than typical for other versions of unsupervised analysis, thus capturing more of the scene information. This multiplicity of clusters is called "hyper-clustering," and enables reasonable reproduction of the scene from just the cluster information alone. Therefore, hyper-clustering also constitutes a method of image data compression. Another substantial advantage for MRLC users is that EROS Data Center will precluster the scene and provide this information in the manner of an additional image band. Thus, MRLC users need not be bothered with the clustering phase at all and can get right to the business of assigning clusters to desired map categories with the SPECTRUM software.
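The compression aspect can be sketched as follows: each pixel is stored as a single cluster number, and an approximate image is rebuilt from the table of cluster means. This is an illustrative sketch only; the function names and sample values are assumptions, and SPECTRUM's own file formats are not represented.

```python
def encode(pixels, means):
    """Replace each multi-band pixel by the number of its nearest cluster mean."""
    def nearest(pixel):
        dists = [sum((p - m) ** 2 for p, m in zip(pixel, mean)) for mean in means]
        return dists.index(min(dists))
    return [nearest(p) for p in pixels]

def decode(cluster_numbers, means):
    """Rebuild an approximate image by looking up each cluster's band means."""
    return [means[k] for k in cluster_numbers]

# Hypothetical three-band cluster means and pixels.
means = [(10, 20, 30), (200, 180, 160)]
pixels = [(12, 19, 33), (198, 182, 158), (9, 22, 29)]

codes = encode(pixels, means)      # one small integer per pixel
approx = decode(codes, means)      # approximation of the original pixels
```

The more clusters there are, the closer the decoded image comes to the original, which is why a large number of clusters reproduces the scene reasonably well.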
One begins by loading the cluster image and associated cluster information into memory of a UNIX workstation computer. The next order of business is to select three "image bands" for display on the screen. In fact, the resulting display is an approximation of the original image as rendered through the spectral band means for the several clusters. Analysts with photointerpretation experience will probably choose either a band combination that gives a "color-infrared" view or a "conventional color" view. Each has advantages for interpreting particular types of landscape features. Various "indexes" such as greenness, brightness, wetness, and so on can also be displayed if the analyst is familiar with their formulation as ratios or linear combinations of spectral bands.
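The display step and the index formulations can be sketched as below. The data layout (a dictionary of band means per cluster), the function names, the band positions, and the weights are all hypothetical; in particular, the weights are placeholders and not the published coefficients of any greenness, brightness, or wetness index.

```python
def render_display(cluster_image, band_means, rgb_bands):
    """Map each cluster number to an (R, G, B) triple taken from its band means."""
    return [
        [tuple(band_means[c][b] for b in rgb_bands) for c in row]
        for row in cluster_image
    ]

def band_ratio(mean, numerator, denominator):
    """Index formed as the ratio of two band means."""
    return mean[numerator] / mean[denominator]

def linear_index(mean, weights):
    """Index formed as a linear combination of band means."""
    return sum(w * v for w, v in zip(weights, mean))

# Hypothetical band means for two clusters; list position 3 stands in for near-infrared.
band_means = {1: [40, 30, 25, 90], 2: [20, 15, 10, 5]}
cluster_image = [[1, 2], [2, 1]]

# Infrared on the red plane, then the two bands before it on green and blue.
display = render_display(cluster_image, band_means, rgb_bands=(3, 2, 1))
ratio = band_ratio(band_means[1], 3, 2)
combo = linear_index(band_means[1], [0, 0, -1, 1])  # placeholder weights
```

Because every pixel in a cluster shares the same band means, any index needs to be computed only once per cluster rather than once per pixel.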
The desired map legend is next entered as a set of category labels for landscape features of interest (e.g., land-cover classes). Along with specifying a category label, one chooses a color to appear on the screen for "pixels" which will be placed in that category. The actual process of assigning clusters to map categories then begins. A "zoom" window is opened, and a representative sector of the image is moved into the zoom window with the mouse-driven cursor. As the cursor is moved around in the zoom window, the number of the cluster in that pixel location is displayed. One chooses a pixel location for which the map category is known from ancillary information, "ground truth," or general "lay of the land" as seen in the image display. Double clicking the location brings up a window for assigning the particular cluster number to a map category. All other pixels belonging to the same cluster then appear in the designated category color throughout the rest of the image. Clusters can be transferred from one map category to another if desired. For those with digital image analysis experience, this latter process is very much like "training set" selection in supervised analysis.
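In data terms, the assignment process amounts to maintaining a lookup from cluster number to map category and recoloring the image through it. A minimal sketch, with hypothetical names, legend entries, and colors:

```python
def assign_cluster(assignment, cluster, category, legend):
    """Place a cluster in a map category; reassigning simply overwrites the old entry."""
    if category not in legend:
        raise KeyError(f"category not in legend: {category}")
    assignment[cluster] = category

def paint(cluster_image, assignment, legend, unassigned_color=(0, 0, 0)):
    """Color every pixel by the category of its cluster, if one has been assigned."""
    return [
        [legend[assignment[c]] if c in assignment else unassigned_color for c in row]
        for row in cluster_image
    ]

# Hypothetical legend (category label -> display color) and cluster image.
legend = {"water": (0, 0, 255), "forest": (0, 128, 0)}
cluster_image = [[5, 9, 5], [2, 9, 9]]

assignment = {}
assign_cluster(assignment, 5, "water", legend)
assign_cluster(assignment, 9, "forest", legend)
painted = paint(cluster_image, assignment, legend)
```

Assigning one cluster recolors all of its pixels at once, which is why a single double click changes the appearance of the whole image.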
If one is interested only in a very general categorization (perhaps water, forest, agriculture, and other), the assignment can probably be accomplished without recourse to ancillary information according to the appearance of the landscape in the image. If one is interested in a more detailed categorization (perhaps vegetation community types), it becomes necessary to adopt the traditional photointerpreter's approach to convergence of evidence using ancillary information (topo maps, soils maps, airphotos, etc.). This involves a special "highlight" category in which each cluster is temporarily placed by itself so that the distribution of its member pixels over the landscape can be viewed readily. The cluster can then be examined in terms of elevation, aspect, soils, and so on, in order to determine its characteristics relative to criteria for map categories. Although more time-consuming, it may be appropriate to run a text editor as a separate process in a window so that the characterization for each cluster can be documented in the course of interpretation. A bit of counsel based on photointerpretation experience is that careful assignment is generally more than repaid by avoidance of frustration in correcting errors later.
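The highlight idea, together with a summary of an ancillary layer over the highlighted pixels, can be sketched as follows. The function names and the elevation values are hypothetical, and the ancillary layer is assumed to be registered to the cluster image cell for cell.

```python
def highlight_mask(cluster_image, cluster):
    """True wherever a pixel belongs to the cluster being examined."""
    return [[c == cluster for c in row] for row in cluster_image]

def summarize_ancillary(cluster_image, cluster, layer):
    """Summarize an ancillary layer (e.g., elevation) over one cluster's pixels."""
    values = [
        v
        for cluster_row, layer_row in zip(cluster_image, layer)
        for c, v in zip(cluster_row, layer_row)
        if c == cluster
    ]
    if not values:
        return None
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }

# Hypothetical cluster image and co-registered elevation layer (meters).
cluster_image = [[7, 3], [7, 7]]
elevation = [[300, 120], [340, 380]]

mask = highlight_mask(cluster_image, 7)
summary = summarize_ancillary(cluster_image, 7, elevation)
```

A summary of this kind is the sort of note one might record for each cluster in the text editor while interpreting.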
We would advise that you carry a typical quarter-scene (TM) through the entire process, including verification, before proceeding with the rest of your imagery. This will alert you to the likely pitfalls for the remainder of work, give you a good sense of expected accuracy, and perhaps reveal some category confusion that simply cannot be resolved in this particular mode of mapping. In the latter case, you should plan on refining your draft map by subsequent exploitation of other sources of information.
Phenology is very important in separating land-use/land-cover and vegetation classes on the basis of spectral information. The scene with which we experimented in the workshop was clustered as a composite of two images, one from early summer (June) and the other from fall (late in October). This is a particularly advantageous combination relative to phenology, and the composite clustering is much better than having the same two scenes clustered separately.
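One way to picture a composite is as a stacking of the bands from both dates into a single longer vector per pixel before clustering. The sketch below assumes the two images are co-registered and listed pixel for pixel in the same order; the function name and the values are hypothetical.

```python
def composite(date_a, date_b):
    """Stack the bands of two co-registered dates into one vector per pixel."""
    if len(date_a) != len(date_b):
        raise ValueError("images must contain the same number of pixels")
    return [tuple(a) + tuple(b) for a, b in zip(date_a, date_b)]

# Hypothetical two-band values: the two pixels look alike in June but differ in October.
june = [(30, 80), (31, 79)]
october = [(45, 40), (28, 75)]

stacked = composite(june, october)
```

Pixels that are nearly identical on one date can be well separated in the stacked vector, which is what clustering the two dates separately cannot exploit.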
The composite gives rise to a large number of clusters, several of which are likely to represent the same map category. It is much easier, however, to assign several clusters to the same map category than to face the prospect of lack of separability between categories. A given forest type may be in different stages of fall color change as a result of elevation differences, giving several clusters for the same category. However, such changes also permit detecting conifers in mixture with hardwoods and induce crop differences associated with senescence or harvest. More ancillary information may be needed to account for phenological distinctions between clusters, but the distinctions at least become possible. Dual dates also allow working under clouds as long as the clouds do not coincide in both images.
Working with a multi-date composite will require the interpreter to alternate views of the image. It will be necessary to switch back and forth between early-season infrared and late-season infrared, perhaps along with conventional color for one or both dates. Multiple dates also increase the importance of learning expected spectral signatures, which are levels of differing reflectance between bands and dates for particular types of features. SPECTRUM makes available a signature profile (plot of band means) when an instance of a cluster number is pending category assignment.
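A signature profile is simply the cluster's band means laid out band by band. The text-mode sketch below is only an illustration of that idea, not a reproduction of SPECTRUM's plot; the function name, band labels, and values are hypothetical.

```python
def signature_profile(mean, band_labels, scale=10):
    """Return one text line per band: label, a bar proportional to the mean, and the mean."""
    lines = []
    for label, value in zip(band_labels, mean):
        bar = "#" * round(value / scale)
        lines.append(f"{label:<10}{bar} {value}")
    return lines

# Hypothetical two-date cluster mean (red and near-infrared for each date).
labels = ["Jun red", "Jun NIR", "Oct red", "Oct NIR"]
cluster_mean = [30, 80, 44, 41]

profile = signature_profile(cluster_mean, labels)
for line in profile:
    print(line)
```

Read across dates, a profile of this kind shows at a glance whether a cluster keeps or loses its infrared response between early summer and fall.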
Multi-date composites will complicate the prospect of preclustering at the EROS Data Center. EROS may find it logistically impractical to precluster in different combinations of years and dates. This will serve as motivation for user sites to undertake their own clustering.
Provision for Refinement
It would be unrealistic to expect that the foregoing SPECTRUM scenarios will adequately address all map categories for all thematic contexts. Thus, it is only prudent to anticipate possible need for further refinement after you have done your best in SPECTRUM. SPECTRUM itself does not currently embody substantial capabilities for on-screen map editing outside the cluster environment. There are several paths by which the results of SPECTRUM work can be carried into other software systems that are better geared to editing operations. Unfortunately, the transport utilities are also not currently part of SPECTRUM per se. You are referred to remote sensing personnel at EROS Data Center for determining the most expedient import/export capability relative to your favorite GIS.
Making SPECTRUM More Commodious for Interpreters
SPECTRUM developers have apparently done little in the way of multi-temporal interpretation themselves, else they would have made it unnecessary to keep repeating some of the interpretive operations. The most obvious instance involves switching of image views. It is presently necessary to associate a spectral band with each color plane of the computer display each time you want a different view. When you have once set up a view in this manner, it should be possible to "save" the view under some name so that it can be reselected easily when it is needed again. We strongly urge that such a capability be added to SPECTRUM in its next version.
Equally annoying is the need to specify a numeric level of color for each plane in assigning a color to a category. Susan Benjamin currently has a sheet of paper that associates color levels with color names. We wholeheartedly encourage the incorporation of name-based color selection as an option in SPECTRUM. However, the capability to specify colors by numeric level should also be retained.
We also view as a practical necessity the ability to "quick save" and retrieve the status of category assignments along with cluster means by cluster and band number to/from an ASCII file. This would not only allow interruption and resumption of work sessions and return to prior stages, but also local programming of bridgework to statistical packages.
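The requested capability might look something like the sketch below. The file layout (one tab-separated line per cluster: cluster number, category, then band means) is purely an assumption for illustration and is not a SPECTRUM format; the function names are likewise hypothetical.

```python
UNASSIGNED = "UNASSIGNED"  # hypothetical marker for clusters not yet in a category

def quick_save(path, means, assignment):
    """Write cluster number, category, and band means as one tab-separated line each."""
    with open(path, "w", encoding="ascii") as f:
        for cluster in sorted(means):
            fields = [str(cluster), assignment.get(cluster, UNASSIGNED)]
            fields += [repr(float(v)) for v in means[cluster]]
            f.write("\t".join(fields) + "\n")

def quick_load(path):
    """Read back the cluster means and category assignments written by quick_save."""
    means, assignment = {}, {}
    with open(path, encoding="ascii") as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            cluster = int(fields[0])
            if fields[1] != UNASSIGNED:
                assignment[cluster] = fields[1]
            means[cluster] = [float(v) for v in fields[2:]]
    return means, assignment

# Hypothetical session state: two clusters, one of them assigned.
means = {1: [40.5, 30.0], 2: [20.0, 15.25]}
assignment = {1: "forest"}
```

A flat layout of this kind can be read directly by most statistical packages, which is the "bridgework" use mentioned above. Category labels containing tab characters would need escaping in this sketch.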
Procurement and Platforms
SPECTRUM was developed to run in the Khoros software environment on UNIX workstation computers. It is possible to obtain Khoros with SPECTRUM by anonymous FTP through the Internet. If interest lies solely in SPECTRUM, however, one should seek a stand-alone version from EROS Data Center.
It must also be noted that not all UNIX workstations are created equal relative to SPECTRUM. SPECTRUM saw its first intensive use on Data General (DG) platforms at the workshop. While individually and collectively instructive, the workshop was not thematically productive due to frequent lock-up of the DGs during SPECTRUM sessions. Such problems have not occurred on Sun workstations. Version 2.0 of SPECTRUM is due for release in September and will have been tested on DGs.
Wish List for Sophisticated Analysts
We would like to:
a) Have current cluster enter scatter plot last so that color/position is not obscured by plotting of other clusters;
b) Have optional scatter plots on principal component axes;
c) Examine the spectral heterogeneity of individual clusters (standard deviations to go with means);
d) Retain the seed for a cluster and examine its relation to the ultimate cluster mean;
e) Examine the spectral heterogeneity of clusters assigned to a thematic category;
f) Explore the prospective addition of clusters to a thematic class on the basis of spectral similarity;
g) Create supercategories of categories for spectral comparison;
h) Explore the intercluster spectral structure through higher-dimensional displays and/or collapsing dendrogram;
i) Create spatial partitions of a spectral cluster for separate labeling by polygonal enclosure with cursor;
j) Have capability for explicit seeding of clusters, including cluster means from other scenes that may not actually exist as a pixel in present scene;
k) Restrict Monte Carlo sampling with an exclusionary binary mask, i.e. cluster for multiple strata;
l) Display multiple spectral reflectance curves, e.g., display curves for deciduous forest types to compare 'characteristic' spectral signatures;
m) Save a library of spectral reflectance curves;
n) Build a menu of 'standard' indices or formulas, e.g., greenness, wetness, brightness, etc., so the user doesn't have to type them in.
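As an illustration of item (c), per-cluster standard deviations can be computed alongside the means when the member pixels are available. The sketch below uses the population standard deviation; the function name and the pixel values are hypothetical.

```python
from statistics import mean, pstdev

def cluster_statistics(pixels, labels):
    """Per-cluster band means and population standard deviations."""
    members = {}
    for pixel, label in zip(pixels, labels):
        members.setdefault(label, []).append(pixel)
    return {
        label: {
            "mean": [mean(band) for band in zip(*group)],
            "stdev": [pstdev(band) for band in zip(*group)],
        }
        for label, group in members.items()
    }

# Hypothetical two-band pixels with their cluster numbers.
pixels = [(10, 20), (14, 20), (200, 180)]
labels = [0, 0, 1]

stats = cluster_statistics(pixels, labels)
```

Pooling the pixels of all clusters assigned to one thematic category and applying the same calculation would address item (e) in the same way.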
Wayne Myers, Penn State University
Gail Thelin, USGS-WRD NAWQA
Susan Benjamin, NMD NASA-AMES Research Center
Ann Raspberry, Maryland, DNR
Joy Hood, EROS Data Center
Paul Etzler, EMSL, Las Vegas, NV
Jim Majure, Iowa State University
John Brakebill, USGS-WRD Potomac NAWQA
Pat Green, EPA-EMAP Forest, RTP, NC
John Findley, USGS-NMD, Reston, VA