



















14.
NIMA and Its Information Architecture--A Clean Sheet
As mentioned previously,
the Commission is enthusiastic about the Director's reformulation of NIMA
as custodian of the US Information and Geospatial Service (USIGS). Sometimes
misunderstood, this reformulation is emblematic of a healthy change in
focus, away from systems, away from products, away from processes, and
toward information services. This is not to say that NIMA will no longer
produce its hallmark maps and imagery intelligence products.
As NIMA focuses on information services, the hardcopy maps and reports
are byproducts--intentionally useful derivatives, but not the essence
of NIMA.
A critical consequence
of the reformulation is the need to get the information architecture just
right. Otherwise, the future extensibility of USIGS will be severely limited.
New applications will not be able to flower.
A sub-panel of the
Commission examined a possible architecture unconstrained by any
legacy issues--a "clean sheet" was the starting point for a top-level
design exercise. The conclusion of the sub-panel, endorsed by the Commission
as a whole, is that to support NIMA's transition to an information service,
the USIGS information architecture must become "data-centric." To anticipate
the discussion, this means that all TPED processes--and subsequent analytic
processes, as well--become transactions against the database, each deriving
value from, and adding value to, the database.
14.1
The Importance of Architecture
The importance of
focusing considerable energy on NIMA's information architecture cannot
be overstated. NIMA is embarked on a major acquisition initiative for
its tasking, processing, exploitation, and dissemination (TPED) process,
which will, for better or worse, solidify its information architecture
for a decade or two to come. The Commission fears that, left to its own
devices, NIMA's information architecture could well remain system/function-centric,
structured around discrete systems purchases made several hundred million
dollars at a time. While these systems could be individually coherent,
and would likely meet current stated requirements, they would neither
position NIMA to take full and continuing advantage of the revolution
in information technology, nor interface gracefully to systems and processes
as yet unimagined.
To oversimplify slightly,
the Commission is inclined to believe that TPED and other major applications
would be best served if NIMA were to develop a new architecture, a new
process by which to acquire this architecture, and a new organizational
form to take advantage of it. The new architecture would be built upon
a distributed database that integrates geospatial and imagery information--and
can extend to encompass information derived from other "INTs". The new
process would adopt COTS to the maximum useful extent, proceed in periodic
increments, and cut back on requirements for systems integration.
The new organization would focus NIMA on its emerging role as content
provider for the Global Information Grid (GIG).
It is with temerity
that the Commission offers for consideration this more detailed discussion,
not to provide a blueprint, but to illustrate how fundamental changes
in architecture create fresh possibilities--yes, and raise new issues.
It should neither be accepted uncritically, nor discarded petulantly.
It should serve merely to illustrate how rethinking TPED without preconceptions
can inform the structure and composition of NIMA's information systems,
and indeed, NIMA itself. The Commission realizes that insofar as there
are sound ideas here, they are neither unique to the Commission, nor absent
in NIMA's own thinking.
14.2
Toward a New Architecture
Only half jokingly
has NIMA, in its current configuration, been described as "two communities
separated by a common agency." Imagery analysis, with its intelligence
heritage, is quite comfortable with its functionality allocated as TPED.
Geospatial analysis, with its cartographic heritage, is less well served
by the TPED nomenclature and more at home with order entry tracking (OET)
and work flow management (WFM). While either argot could be adapted to
(or adopted by) either community, the data-centric construct accommodates
both. The Commission cautiously asserts that beyond being an inclusive
construct, data-centricity is a unifying construct.
NIMA is perched on
the edge of a systems acquisition that will influence its information
environment for years to come. This provides NIMA with a unique opportunity
to consolidate its information architecture. The Commission believes that
NIMA's information infrastructure should be built around an integrated
data architecture, not around a collage of systems, products, or processes.42
Actually, the Commission's view is grander still. If done skillfully,
NIMA would become the architect, if not the custodian, of the Geospatial
Information System for the larger national security community--intelligence
and operations, diplomatic and military, strategic and tactical.
This "mother of all
databases"43
at the center should be the conceptualization, if not the container, of
all the national security community's geo-referenced (and time-tagged)
information.44
Indeed, nearly all relevant information is, or could profitably be, geo-referenced.
"The Central Database"--which need be neither singular nor centralized--must
be widely and easily shared among users and, in the first instance, should
hold vector data (the stuff of maps) and raster data (the stuff of images)
as a seamlessly packaged whole. The database should be structured to be
independent of client or application, fully distributed, and capable of
accepting successive value-additions and user annotations. These features
would depart from NIMA's current information architecture (though some
of NIMA's as-yet-unimplemented plans pull in that direction).
14.3
A Database to Support the TPED Process

As shown in the accompanying
illustration, such a database could constitute the primary--not necessarily
sole--support for the imagery TPED process; indeed, it would support any
number of TPED processes as such.
All TPED functionality--from
requirements and tasking, to data reception, processing, exploitation,
and dissemination--can be seen as transactions against a database. That
this database may be parsed, distributed, replicated, aggregated, and
so on is key. Transactions--the value added to data in the database--need
not adhere to the sequential implications of traditional TPED interpretation.
14.4
Tasking, Processing, Exploitation, and Dissemination as Transactions
Tasking flows from
an expression of information needs and logically starts with an investigation
of what already exists--Are the data in a database? Is the product already
in inventory? If so, pull it. If not, order it. Ask that it be pushed
to you, or ask to be advised as to when it is available to be pulled.
In the "back office" the order is processed--pulled from a queue, or pushed
to the fulfillment process. These are different views--depending upon whether
one is in front of the counter or behind it--but both can be reconciled
as transactions against a database. Much can be relegated to server applications:
notification, standing taskings, and the like.
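By way of illustration only, the following sketch (in Python, with hypothetical
names such as holdings, orders, task, and fulfill) suggests how "check the
database first, then order" might be expressed as transactions against the
database; it is a sketch of the idea, not a proposed design.

    # Sketch only: the store, the order queue, and the delivery modes are
    # hypothetical, illustrating tasking as a transaction against the database.
    holdings = {}   # (area, product_type) -> dataset already on hand
    orders = []     # the "back office" queue of unfilled taskings

    def task(area, product_type, delivery="pull"):
        """Satisfy a need from the database if possible; otherwise order it."""
        key = (area, product_type)
        if key in holdings:
            return holdings[key]            # it already exists: pull it
        orders.append({"key": key, "delivery": delivery})
        if delivery == "push":
            return "will be pushed when available"
        return "will be notified when available to pull"

    def fulfill(key, dataset):
        """Back-office side: new data arrive and are posted to the database."""
        holdings[key] = dataset
        for order in [o for o in orders if o["key"] == key]:
            orders.remove(order)
            # push the data, or send a notice, per the user's stated preference
            print("push" if order["delivery"] == "push" else "notify", key)
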
Processing, in the
first instance, refers to turning the information downlinked from the
satellite (in what we might refer to as a "proprietary" format) into a
"picture" ready for exploitation, on film or on soft-copy. Processing
operations are, generally, done for each picture and so it makes sense
to do these prior to the exploitation phase, on large capable hardware
close to the downlink entry point. If and when exploitation operations
become so routinized that they can be done automatically--say, change
detection--then that process might well migrate from the exploitation
segment and move "upstream" into the processing segment. In organizational
terms, this could mean that NIMA cedes control and execution of these
processes to the National Reconnaissance Office (NRO) or commercial operator.
No matter who performs them, insofar as the original downlinked information
is archived, successive processing operations can also be seen as transactions
against a database.
In the same sense,
the succession of value-added exploitation steps can be seen as transactions
against the database. The (copy of the) image is pulled from the database,
value is added, and the modifications and/or modified picture are written
back into the database. Thus, exploitation can also be seen, as in the
accompanying figure, as a series of transactions (involving imagery but
also related vector information), which can continually enrich the database
with new features (e.g., a newly discovered double-perimeter fence
line) and annotations upon old features.

Dissemination--the
intellectual task of deciding to whom information should go, as distinct
from distribution, which is the process of carriage--entails both "push"
and "pull." In the former case, a background process--driven, say, by
tables that codify users' expressions of needs and wants--runs against
new postings to the database and sends that information, or a notice of
new information, to the users who want it. In the pull case, users run queries
against the database holdings. Indeed, if the query language allows the
user to specify not only how far back in the archive the search should
be conducted, but also how far into the future, the distinction between
push and pull logically disappears.
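The collapse of the push/pull distinction can be made concrete: a query whose
time window extends into the future is simply a standing subscription. A
minimal sketch follows (Python; the postings archive and query interface are
hypothetical), offered only to illustrate the point.

    from datetime import datetime

    postings = []       # (timestamp, item) pairs already in the archive
    subscriptions = []  # open queries whose windows extend into the future

    def query(predicate, start, end):
        """Return matching holdings; if 'end' lies in the future, also register
        the query so that later postings are pushed to the requester."""
        now = datetime.utcnow()
        results = [item for (ts, item) in postings
                   if start <= ts <= min(end, now) and predicate(item)]
        if end > now:
            subscriptions.append((predicate, end))   # pull quietly becomes push
        return results

    def post(item):
        """New information arrives; run it against the open subscriptions."""
        now = datetime.utcnow()
        postings.append((now, item))
        for predicate, end in subscriptions:
            if now <= end and predicate(item):
                print("pushing to subscriber:", item)
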
We have taken the
liberty, in the preceding discussion, of pretending that there is actually
one integral database. That need not be the case, and some would argue
that in terms of implementation, no one database could possibly satisfy
all. But the master geo-referenced database still holds its position
as the logical source of and sink for NIMA work.
14.5
Vector-Raster Integration
The NIMA database
ought to permit clients to access vector and raster information in an
integrated fashion--i.e., "normalized" to each other so that the user
can drape one over the other seamlessly and transparently. As the accompanying
figure suggests, image analysts themselves may be able to do their jobs
better by being able to see "through" images into underlying geospatial
data (or take advantage of geospatial analysis that may indicate, for
instance, likely hiding areas for SCUDs; see A Tale of Two Cities,
elsewhere in this report).

Today, such a database
would naturally contain "chips" of an image--e.g., polygons containing
interesting pieces of the larger image. Today, the polygon would be determined
by geospatial coordinates--say, a rectangle 2km by 3km centered on a set
of geo-coordinates, the "aim point." Eventually, we can expect the chips
to be determined more by imagery content--a building, or a compound, or
the right-of-way along a road. In either case, a goal is to accommodate
better the "bandwidth-challenged" user--fielded forces, those at sea,
or airborne. Even with conventional compression, the "last tactical mile"
generally constrains us from sending full-size images, which will, themselves,
get larger with the next generation of imagery satellites just about as
fast as bandwidth will increase. So, the ability to combine vector-map
data (which are generally compact for the area covered) with imagery extracts
of key visual features may be the best of all worlds.
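The arithmetic behind the chip is simple to sketch. The fragment below (Python,
with an assumed ground sample distance and an invented feature encoding,
neither of which is a NIMA format) cuts a 2 km by 3 km extract around an aim
point and pairs it with a compact vector feature:

    # Illustrative only: a scene as a 2-D array, a chip cut around an aim
    # point, and a compact vector feature sent alongside it.
    METERS_PER_PIXEL = 10.0   # assumed ground sample distance

    def chip(image, aim_row, aim_col, height_m=2000, width_m=3000):
        """Return the sub-array covering a height_m x width_m rectangle
        centered on the aim point (clipped at the image boundary)."""
        half_h = int(height_m / METERS_PER_PIXEL / 2)
        half_w = int(width_m / METERS_PER_PIXEL / 2)
        top, left = max(0, aim_row - half_h), max(0, aim_col - half_w)
        return [row[left:aim_col + half_w] for row in image[top:aim_row + half_h]]

    # A vector feature is tiny compared with the raster it annotates.
    road = {"type": "road", "points": [(38.92, 40.11), (38.95, 40.13)]}

    scene = [[0] * 1000 for _ in range(1000)]          # stand-in for a full image
    extract = chip(scene, aim_row=500, aim_col=500)
    print(len(extract), "x", len(extract[0]), "pixel chip, plus", road["type"])
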

14.6
Product, Application, and Client Independence
For many users, NIMA
still is defined by its catalog of standard map products, paper or CD-ROM.45
The Commission believes, however, that such products are better thought
of as renderings of datasets extracted for specific purposes from a larger
database. Users themselves create "products" from the database that NIMA
provisions. A "standard" product becomes one where a script has been generated
to ensure some uniformity in the data extraction and rendering.
Where once NIMA's
job was to make maps, tomorrow its job will be to provision the database
and ensure the availability of applications that enable a user (or another
application) to call for data using a combination of coordinates, scale,
feature sets, and in some cases, currency (what time period is relevant)
from an integrated database. Data should be accessible through multiple
methods, as shown in the accompanying figure. GIS data can also be used
(and thus should be formatted to be easily used) as an input to planning,
modeling and simulation, and planners may be able to exploit the database
without ever having to see a map or an image.
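As a sketch of what this could mean in practice, the fragment below (Python;
the function names, parameters, and feature classes are hypothetical) treats a
"standard product" as nothing more than a saved script over a data extraction
specified by coordinates, scale, feature set, and currency:

    def extract(db, bbox, feature_classes, as_of=None):
        """Pull every feature of the requested classes that falls inside bbox
        and is current as of 'as_of' (None means latest)."""
        return [f for f in db
                if f["class"] in feature_classes
                and bbox[0] <= f["x"] <= bbox[2] and bbox[1] <= f["y"] <= bbox[3]
                and (as_of is None or f["valid_from"] <= as_of)]

    def render(features, scale):
        """Turn an extracted dataset into a 'product' (here, a plain listing)."""
        return "\n".join(f"{f['class']} at ({f['x']}, {f['y']}) at 1:{scale}"
                         for f in features)

    # A "standard" product is simply a stored script over the same database.
    def standard_topo_product(db, bbox):
        return render(extract(db, bbox, {"road", "river", "contour"}), 50000)

    db = [{"class": "road", "x": 3.0, "y": 5.0, "valid_from": 2000}]
    print(standard_topo_product(db, bbox=(0, 0, 10, 10)))
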

The ability to call
on NIMA's database through standardized function calls should be a capability
that others can build into their products. The separation of client and
server functions through modular interfaces also eases the systems integration
problems (the importance of which is discussed below). Support must be
provided both for thick clients, with software powerful enough to manipulate
and finish the product, and for thin clients, which can only display a map as
a picture but cannot manipulate it as data. Overall, the user interface
should be a function, not of the database, but of the user's requirements.

Making GIS data broadly
accessible via standard protocols permits anyone to build new applications
for users. This frees NIMA from having to guess how its data will be used,
and allows unanticipated uses to flourish. The data provider simply cannot
be prescient enough to anticipate all the uses to which the data will
be put. Traditionally, however, data can be seen only through conforming
applications, and manipulated only through routines built into the applications
themselves. The software behind the Common Operational Picture (COP: the
real-time view of the battlefield), for instance, has no macro language.
Best commercial practice, however, avoids this dead end, and so, too,
must NIMA.

14.7
Location Independence
The "NIMA database"
can (and should) be distributed both physically and virtually. As the
accompanying figure illustrates, it suffices that one node "know" where
all the relevant data sits; the many data streams that go into a GIS system
may sit in various locations (and be managed by various owners within
and without NIMA) as long as their interconnections--through the GIG,
say--are sufficiently robust. Storage, communications and processing all
trade off against each other, and the best effect is achieved when a single
architect has the freedom to make all the tradeoffs--i.e., to globally
optimize the network design.
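A sketch of the "one node knows where the data sit" idea follows (Python; the
catalog entries and holder names are invented for illustration). The user
addresses one logical database, and the catalog resolves where, and by whom,
each piece is actually held:

    # Hypothetical catalog: the single node that "knows" where the data sit.
    catalog = {
        "imagery/theater-x":  "server at site A (NIMA)",
        "vector/world-roads": "server at site B (allied partner)",
        "weather/gridded":    "server at site C (another agency)",
    }

    def locate(dataset):
        """Resolve a dataset name to whichever node actually holds it."""
        try:
            return catalog[dataset]
        except KeyError:
            raise LookupError(f"no holder registered for {dataset!r}")

    def fetch(dataset):
        """The caller never cares about the physical layout of the database."""
        return f"retrieved {dataset} from {locate(dataset)}"

    print(fetch("vector/world-roads"))
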
"Ownership" of data
ought to be divorced from locality. There is no need to invest the CINCs
with responsibility to hold and manage a set of images taken with national
assets over their AORs (areas of operational responsibility); it is not even
clear that information acquired with theater assets (e.g., UAVs)
ought to be part of an exclusive CINC image library either. True, leaving
the command image libraries in place may be optimal from the networking
point of view--as long as they are globally accessible. But how users
"see" the database can be expected to vary only with their employer, clearance,
and need to know.
14.8
Annotation
The "NIMA database"
must support value-added contributions from anyone, anywhere--the database
must host user-supplied annotation. This opens it to a good deal of informed
(but, alas, also uninformed) commentary but it also gives users a stake
in understanding the GIS database because of their ability to contribute
to it. (Although the emergence of client-to-client programs, such as Napster,
suggests that the distinction between clients and servers is eroding, all NIMA
information should be server-accessible because client connections are
uncertain and the security implications of client-to-client connectivity have
yet to be fully explored.)
Over time, annotations
should become a very significant part of the total database. Indeed, the
value of having the database capture the feedback of users (both from
DoD and the rest of the Intelligence Community) could rival that of the
database itself. Annotation should be understood as exactly that: not
the official database, itself, but commentary thereon. Thus, NIMA would
retain responsibility for the master plot.
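One way to keep official holdings and commentary distinct is to store
annotations in their own table, keyed to the features they discuss, leaving the
master plot untouched. A minimal sketch follows (Python; the record layouts are
invented):

    # The master plot remains NIMA's responsibility; annotations sit beside it.
    master = {"feature-001": {"type": "fence line", "source": "NIMA exploitation"}}
    annotations = []   # commentary on the master plot, never merged into it

    def annotate(feature_id, author, text):
        """Attach commentary to a feature without altering the feature itself."""
        if feature_id not in master:
            raise KeyError(f"unknown feature {feature_id!r}")
        annotations.append({"feature": feature_id, "author": author, "text": text})

    def view(feature_id):
        """Readers see the official record plus the accumulated commentary."""
        notes = [a for a in annotations if a["feature"] == feature_id]
        return master[feature_id], notes

    annotate("feature-001", "theater analyst", "second perimeter visible since May")
    print(view("feature-001"))
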
14.9
The Need for a Rigorous Data Model
In developing an architecture
for the NIMA database, a rigorous data model inherently comes first. All
other decisions (such as the systems model) ought to follow, not lead.
Such a data model can be conceptualized as the three concentric rings
of the accompanying figure. In the center are the core scalable database
and network structures (i.e., the processing, storage, and distribution
engines).

In the middle ring
are the basic data types of a GIS: raster data, vector data, feature
data, networks, grids, TINs (triangulated irregular networks), fundamental
objects, etc. In the outer ring are constructed objects (e.g., a
street, a multi-spectral image, a vertical obstruction, an "urbanized
area"). Such a data model, therefore, would contain a definition of feature
classes, metadata, and symbology.
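Read as a type hierarchy, the rings might be sketched as follows (Python; the
class names are illustrative, not a proposed NIMA schema), with basic GIS data
types in the middle ring and constructed objects in the outer ring composed
from them:

    from dataclasses import dataclass, field
    from typing import List, Tuple

    # Middle ring: basic GIS data types.
    @dataclass
    class Raster:
        pixels: List[List[int]]

    @dataclass
    class Vector:
        points: List[Tuple[float, float]]

    # Outer ring: constructed objects built from the basic types, carrying
    # the feature class and metadata that the data model defines.
    @dataclass
    class Street:
        centerline: Vector
        feature_class: str = "road"
        metadata: dict = field(default_factory=dict)

    @dataclass
    class MultiSpectralImage:
        bands: List[Raster]
        metadata: dict = field(default_factory=dict)

    street = Street(Vector(points=[(0.0, 0.0), (1.0, 2.0)]),
                    metadata={"source": "survey", "currency": "2000"})
    print(street.feature_class, len(street.centerline.points), "vertices")
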
14.10
Ways to Absorb Data from Third Parties
Commercial GIS users
are beginning to benefit from the widespread sharing of data sets. NIMA
need not create all the information it provides. NIMA already has information-sharing
agreements with many governments, and prospects for further sharing appear
likely. Datasets can be acquired from other US departments and agencies,
as well as from industry.
There are many data
sets (e.g., where embassies are located) that other entities (e.g.,
the State Department) can affordably keep track of much more accurately
than can NIMA, itself. There is no good reason for NIMA not to mirror
such databases within its own system (mirroring eliminates two very significant
problems: first, that of combining classified data with unclassified data and,
second, that of thin or unreliable connections to third-party servers).
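The mirroring idea can be reduced to a few lines (Python; the source dataset
and refresh cadence are invented): NIMA serves a local, read-only copy, while
the owning agency keeps responsibility for accuracy.

    import copy

    # The authoritative copy lives with its owner (here, a stand-in for the
    # State Department's list of embassy locations).
    state_dept_embassies = {"Paris": (48.87, 2.30), "Nairobi": (-1.28, 36.82)}

    mirror = {}   # NIMA's local copy, refreshed on a schedule, served locally

    def refresh_mirror():
        """Copy the third-party dataset into NIMA's own system so users are not
        hostage to thin or unreliable outside connections."""
        global mirror
        mirror = copy.deepcopy(state_dept_embassies)

    refresh_mirror()
    print("embassy at", mirror["Nairobi"], "(served locally, owned elsewhere)")
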
Overall, the more
NIMA's data model is compatible with counterpart data models used by the
USGS, NOAA, FEMA, major allies, or key NGOs (e.g., the World Bank)--the
better. NIMA is best off adapting and adopting commercial standards that
work. But where standards do not yet exist, NIMA has to step in to foster
their creation to permit greater interoperability and collaboration. The
VPF format used in VMAP was developed by NIMA; its success was verified
when others (e.g., NATO) adopted it. It helped that NIMA reached
out to the community in developing VPF; like activities in the future
should have as much participation from the commercial world as they can
get.
14.11
Methods to Deal with Logical Inconsistencies
At one level, logical
consistency appears to be the sine qua non of a map. Roads are expected
to connect, boundary lines to join at their edges, and most buildings
to sit over land, not water.
Unfortunately, although
reality may be consistent, databases often are not, especially when they
come from different sources or were made at different times (each may
have been right when made).
The traditional approach--make it right--may not be the best. The desire
to make things consistent inhibits incremental database updating in favor
of explicit versioning. Flagging contradictions may be better than arbitrarily
declaring one right and one wrong.
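A sketch of "flag it rather than force it": when two sources disagree about the
same feature, keep both versions with their provenance and leave the
contradiction visible (Python; the record layout is invented).

    def reconcile(record_a, record_b):
        """Compare two versions of the same feature. If they disagree, do not
        arbitrarily declare one right: keep both and flag the contradiction."""
        if record_a["geometry"] == record_b["geometry"]:
            return {"status": "consistent", "geometry": record_a["geometry"]}
        return {"status": "contradiction flagged",
                "versions": [record_a, record_b]}   # each may have been right when made

    survey_1995 = {"source": "allied survey", "year": 1995,
                   "geometry": [(10.0, 20.0), (10.5, 20.4)]}
    imagery_1999 = {"source": "imagery extraction", "year": 1999,
                    "geometry": [(10.0, 20.0), (10.6, 20.5)]}

    print(reconcile(survey_1995, imagery_1999)["status"])
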
14.12
Methods to Separate Public from Restricted Information
NIMA's total information
base can be divided into what is unrestricted and what is restricted--either
by license and agreement or because of sources and methods. Currently
almost all of NIMA's digital cartographic products are restricted for
one or another reason. NIMA should continue to exert care in not confusing
the protection of intellectual property with the protection of sources
and methods so that legitimate government users need not have a security
clearance merely to access "the database" for information that is not
classified. The discerning reader will recognize the need for separation,
yet integration, of information as that old bugaboo of multi-level security.
The Commission has no answer other than to suggest that multiple levels
of security are a here-and-now solution. The paradigm shift that is hard
for some to make is to do database operations at the lowest possible level
(not "policy high") and then replicate the data to higher levels. To NIMA's
credit, they seem to understand this. NIMA will also benefit from the
DOD-wide rollout of a Public Key Infrastructure (PKI) and a concerted
effort at Information Warfare Defense/Defensive Information Operations
(IWD/DIO) designed to preserve the confidentiality, integrity, non-repudiability
and availability of essential information. And fortunately, although security
is an area where the federal government often leads the private sector,
commercial firms have increasing motivation to solve these problems of
protecting intellectual property and preserving the privacy of proprietary data.
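The "operate low, replicate upward" paradigm can be reduced to a one-way rule:
data posted at a given level flow to equal or higher levels, never downward.
The sketch below (Python) is an illustration of the rule, not a security
design; the level names and keys are invented.

    LEVELS = ["UNCLASSIFIED", "SECRET", "TOP SECRET"]   # illustrative ordering

    stores = {level: {} for level in LEVELS}   # one database instance per level

    def replicate_up(key, value, from_level):
        """Post data at the lowest possible level, then copy it to every higher
        level; nothing is ever replicated downward."""
        start = LEVELS.index(from_level)
        for level in LEVELS[start:]:
            stores[level][key] = value

    replicate_up("airfield-runway-length-m", 3050, from_level="UNCLASSIFIED")
    print([level for level in LEVELS if "airfield-runway-length-m" in stores[level]])
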
14.13
New Data Types
"The database" should
be capable of holding new data types such as HSI, video, SAR-MTI and urban
data. Each presents its own problems and taxes the extensibility of database
design and the prescience of the data model. No simple answers are at
hand except an open mind.
Powerful examples
of the benefits of fusing multiple sources of intelligence are widely
known, even if less-widely emulated. The challenge for NIMA is to ensure
that its data model and database designs do not constrain the incorporation
of new data types.

The logic of using
geo-referencing to break the tyranny of the intelligence stovepipes is
clear. Thus, the burden of multi-INT integration falls on NIMA--NIMA is
clearly the enterprise to organize such an endeavor by virtue of its deep
geospatial knowledge and its capacious storage and networking capability
(even if, as argued further below, it needs more technological capability
to assume the job).
14.14
Precision and Persistence
Resolution, or ground
sample distance (GSD), is a watchword in the imagery world. Information
differs in how accurately it can be measured. Imagery (both EO and synthetic
aperture radar), for instance, can be accurate to the sub-meter level--but
not always: e.g., MSI, HSI, and USI, for technical reasons, have
successively less resolution, and correspondingly less geospatial precision.
ELINT data are even less precise; so is most acoustic and seismic information.
Most weather data are measured over kilometers.

Information also differs
to the extent that accurate measurement is meaningful. Some phenomena
are inherently fuzzy. Neither the habitat of a species, the turf of
a gang, the catchment area of a shopping center, nor the track of a storm
can be usefully measured in meters. Assigning geospatial attributions
to other phenomena is a stretch. Rumors, for instance, about impending
governmental decisions in Ethiopia may be geospatially tagged to a specific
office building in downtown Addis Ababa, but such tagging feels artificial
or at least of questionable value since its source and impact may be geospatially
distant from the office. Some information has no real geospatial content
whatsoever: the characteristics of a weapons system, or reports on an
impending religious schism.
It is pointless to
give geospatial information more precision than is warranted. But every
datum has to be anchored to some location in a geospatial database.
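One way to anchor every datum to a location without overstating its precision
is to carry an explicit positional uncertainty alongside each record. A sketch
follows (Python; the figures and field names are invented):

    from dataclasses import dataclass

    @dataclass
    class GeoDatum:
        """Every datum gets a location, plus an honest statement (in meters)
        of how precisely that location is known."""
        what: str
        lat: float
        lon: float
        uncertainty_m: float

    holdings = [
        GeoDatum("double-perimeter fence line", 35.1234, 44.5678, 1.0),
        GeoDatum("ELINT emitter fix",           35.10,   44.57,   5000.0),
        GeoDatum("storm track, leading edge",   34.9,    44.2,    50000.0),
    ]

    # Consumers can then filter by the precision a task actually requires.
    precise_enough = [d.what for d in holdings if d.uncertainty_m <= 10.0]
    print(precise_enough)
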
Persistence marks
NIMA's products; evanescence marks the Common Operating Picture (COP).
Yet, persistence is not a binary attribute. Take the accompanying figure.
A mountain pass is forever. Successively, a paved road that traverses
the pass, a gravel trail that leads off the road, an assembly point for
mobile-missile launchers, and finally the Scud in flight are increasingly
fleeting. Nevertheless, sensor-based data, for instance, of mobile objects
acquires context, in large part, from a background of immobile objects.
Accounting for trucks requires accounting for roads and passes, in a sense.
So where is the proper
boundary between "NIMA's data" and that which makes up the Common Operating
Picture (COP)? To what extent should NIMA's data model be built for eventual
extension into the COP data model? Good questions, but no good answers,
as yet.
14.15
Toward Multi-INT integration
The Commission believes
that any architecture recommended by NIMA has to be able to evolve to
a multi-INT architecture. Clear minds will separate this from the questions
of who should implement and who should pay for the implementation.
NIMA should begin
to engineer a broader architecture by which such INTs can be captured
and presented in a coherent fashion. In its simplest form, other-INT data
should be available as layers normalized to NIMA data. From whichever
layer the user starts, he must be able to drill down to access the other
information.
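In sketch form (Python; the layer names and cell keys are invented), "other-INT
data as layers normalized to NIMA data" means that every layer is keyed to the
same geo-reference, so a user starting from any one of them can reach the
others:

    # Each INT contributes a layer keyed to the same geo-referenced cell.
    layers = {
        "geospatial": {("38N", "044E"): "1:50,000 vector coverage"},
        "imagery":    {("38N", "044E"): "EO scene, 0.8 m GSD"},
        "sigint":     {("38N", "044E"): "emitter active within last 6 hours"},
    }

    def drill_down(cell, starting_layer):
        """Whichever layer the user starts from, return what every other layer
        holds for the same geo-referenced cell."""
        if cell not in layers[starting_layer]:
            return {}
        return {name: data[cell] for name, data in layers.items() if cell in data}

    print(drill_down(("38N", "044E"), starting_layer="imagery"))
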

Multi-INT database(s),
as they emerge, should take advantage of the inherent parallelism in TPED
processes across the various INTs--as the accompanying figure suggests,
every INT, as a general proposition, involves tasking, collection, processing,
exploitation, and dissemination.
Still, it is important
to note that the relationships among tasking, collection, and processing
vary by INT. It is also important to note that this multi-INT architecture
does not need to spring into being all at once. We can replace components
as dollars and ideas permit, and invest in those areas that provide the
highest payoff.
Serious thought is
needed on how to manage a federation of databases, separately budgeted,
with crosscutting management structures. Perhaps an intermediate but high-level
interagency group could coordinate the overall data model, and the underlying
technology standards, as well as sponsor consulting and training. DIA's
Joint Intelligence Virtual Architecture (JIVA) provides a model for consideration.
Finally--despite the
Commission's enthusiasm--it is worth remembering that geo-referencing
is not the only way to look at a mass of data.
14.16
Conclusions of the "Clean Sheet" Exercise
Building NIMA's architecture
around a database that integrates maps and images and other relevant intelligence
data, making this database independent of location and client, and permitting
third-party annotation to it together constitute the core recommendations
for the information architecture.
Radical approaches
like these are less risky than they sound. People have been doing data-centric
architectures and databases for many decades, and GIS databases for at
least two of them. The commercial industry is mature in all respects:
workstations, databases, and GIS. Commercial capabilities already exist
to do most of the imagery and geospatial manipulation that NIMA could
want. NIMA is not being asked to approach this architectural requirement
in a way and with a degree of effort that no one has ever done before;
it is asked to apply familiar methods to its problems, which, if unique
in scope, are not unique in form and content.
Footnotes
42
Advocating that NIMA develop a data-centric architecture rather than a
system-centric, product-centric or process-centric architecture may seem,
at first, to run counter to today's government and business practices.
Normally, one first determines the business processes critical to the
organization and then designs an information system to meet these. For
NIMA, though, information is the product.
43
With apologies to Bran Ferren.
44
It will be worth exploring whether, and to what extent, the MIDS-IDB database
administered by DIA should form the conceptual core of a new data-centric
architecture.
45
There were 283 products at last Commission count.