Version: September 29, 2016
Purpose of NIEM
The National Information Exchange Model (NIEM) is a joint partnership across all levels of government (federal, state, tribal, and local) as well as private industry. It is a uniform, harmonized, well-defined set of metadata structures normatively represented in the W3C XML Schema Definition (XSD) Language. We refer to the structures in this set as NIEM data components. They are openly available for use (and reuse) for designing, developing, and implementing standard XML-based information exchange specifications (IEPDs).
The primary purpose of NIEM is to exchange information and data at the message level across different communities of interest. This does not limit its aptness for sharing information within a single community. However, it does drive its foundational design principles and rules. Achieving consensus on a large common vocabulary with well-defined semantics and consistent structure, one that enables numerous communities of interest to share information at the message level, requires a careful balance of rational constraints, flexibility, compromise, and collaboration. NIEM is designed to develop, distribute, and support large-scale enterprise-wide information exchange standards and associated procedures. In turn, this enables a variety of jurisdictions to automate information sharing during both emergencies and daily routine operations. Meeting this goal requires that all participating communities of interest model and maintain their NIEM data components according to NIEM specifications, even when doing so may compromise long-standing traditions within a community (for example, simple data names, terse definitions, or complex data structures). This is necessary (1) to ensure different communities can share message-level information and its associated semantics accurately, and (2) to facilitate model consensus and continued scalability.
NIEM Domains
A NIEM domain is a Community of Interest (CoI) aligned around a common interest. For NIEM the common interest is generally based on a Line of Business (LoB) (or related group of such) for which information must be shared or exchanged.
A NIEM domain steward is responsible for ensuring that a given domain model (namespace or namespaces) is governed and maintained under the NIEM specifications. To do this effectively, the domain steward should commission a team of subject matter experts with technical skills that collectively represent various levels of government for that CoI or LoB. However, the reality is that a domain is sometimes sponsored and governed by one organization because it may be difficult to identify volunteers from various levels of government within the CoI or LoB who can or will participate. These are usually the smaller domains with less content to govern.
Domains are a key aspect of the NIEM architecture for several reasons. NIEM was designed with one set of common Core data components governed by the NIEM Business Architecture Committee (NBAC), and multiple sets of domain data components (derived from the Core) each semi-independently governed by a domain technical team. By design, NIEM partitioning corresponds directly to its governance, and in turn, this facilitates both its version architecture and its ability to scale up (by onboarding new domains). Furthermore, almost every NBAC member is also a member of at least one domain team, which provides significant cross-pollination for collaborating on issue resolution, harmonization, refactoring, and refinement of data components during NIEM release development cycles.
Finally, the NIEM architecture is flexible enough to allow the establishment of domains at-large that intentionally do not participate in governance, and yet, can still use NIEM to share information with member domains or other domains at large.
What this guide covers
This guide does not necessarily detail all topics it discusses. However, the guide often references other NIEM documents or Web pages for more information. It also includes an extensive appendix of references to other NIEM resources.
What you should be familiar with to use this guide
To use this guide, you should be somewhat familiar with the following topics and documents:
Audience
This document is intended for use by technical personnel who support NIEM domain modeling and harmonization during major and minor release cycles, including domain updates. In particular, we have tried to make it most useful to new or emerging domains that are building their initial content. Guidance in this document is useful to NIEM working level governance (NBAC, NTAC, PMO staff, and lead developer) who often evaluate the need to harmonize, refine, or refactor NIEM domain models. It may also be useful to organizations or groups interested in using NIEM but that do not intend to join its governance.
The NIEM High-Level Version Architecture (HLVA) 3.0 identifies processes, artifacts, governance responsibilities and how they interact to produce new releases of and updates to the NIEM model. The HLVA discusses issues with inherently conflicting tradeoffs including (but not limited to):
Under PMO direction and with technical assistance from the lead developer, the NBAC, NTAC, and domains execute NIEM governance at the operational level. Practicing the HLVA has demonstrated the following results:
How to name, define, and structure data components:
Identify data requirements
NIEM was designed to share information across domains (communities of interest or lines of business). You can certainly consider both internal and external data requirements, but you definitely should identify data requirements for sharing information with communities and organizations outside your own. Ask yourself: with whom outside your domain do you share your information? And who from outside your domain shares information that you need?
It helps to identify or develop simple scenarios, and within those scenarios identify common use cases for sharing information. It also helps to examine existing database schemes, data dictionaries, XML schemas, flat files, paper/electronic forms, workflows, etc. for data requirements. Such data sources can provide insights into what data is currently shared and how.
There are likely many variances in data names and definitions that already exist in the sources. To create a good domain model it is necessary to harmonize, i.e., decide on a single name, definition, and structure (type) for each data element, and eliminate duplication. Then map the data model elements (and types) back to their authoritative sources (data dictionaries, database schemes, forms, etc.) and record this mapping for reference. This mapping will likely become a critical resource to programmers who will implement information exchanges with the domain model and may have to trace back to the legacy data sources.
Do not boil the ocean
Model data components for real information requirements that are known to exist or that you know are necessary and are based on actual information sharing scenarios or use cases.
Do not create NIEM data components for every possible contingency or likelihood. Do not create components that might be nice-to-have or that are “likely” future requirements. That said, this does not mean you shouldn’t model new data requirements that are definitely in near term plans.
Note that it is important to consider real information exchange scenarios and associated use cases that will identify both the existing and new near term requirements. If possible, envision what the domain should look like in the (not too distant) future and build the to-be model from the as-is baseline.
Start small scale
Scale back the effort rather than create data components that may have to be deleted or changed in later release cycles, and will subsequently confuse your domain community. Try to ensure the key domain classes (object types) are present in the model, but it is not necessary to be absolutely complete. It is easy to add properties to a type later (in a release or domain update). Furthermore, NIEM types are easily extended with additional elements in IEPDs using augmentation points. IEPD extensions confirm the need for new requirements, and feed them back to the reference model for future additions to NIEM.
Grow your domain model over time as you build NIEM experience. NIEM allows you to publish a domain update anytime outside of the annual release cycle. So, there is always time to catch up and never a need to rush for NIEM’s sake.
Do not overbuild data components. Keep them simple: A type represents a real world object or concept. Elements describe the characteristics or parts of that object or concept. A complex type has elements, elements are typed, and so on down to primitive simple elements (of type string, text, name, date, amount, or token, etc.). For example:
All NIEM elements are defined as complexTypes that are extensions or ancestors of a complex base type in structures.xsd. These base types contain a simple object attribute group that supports NIEM built-in capabilities such as metadata and referencing.
Code lists also contain the simple object attribute group; however, code lists get that attribute group somewhat differently from elements. Code lists require both a complex type and a simple type. Each code element is defined by an associated CodeType (for example, EyeColorCode is defined by EyeColorCodeType). This CodeType is derived from a CodeSimpleType (following the example, EyeColorCodeSimpleType). The CodeSimpleType contains the code values as XML enumerations, while the CodeType extends the CodeSimpleType by adding the simple object attribute group, and the CodeType becomes an XML complexType with simple content.
Map data requirements to NIEM
xml:lang is to an element of type TextType.
Clearly identify data components that will require future harmonization with the Core (or with other domains if inputting a domain update).
Use reference materials to model various NIEM techniques
During a release cycle, provide initial input in NIEM-conformant XML Schema document (XSD) or Change Request (XLS) format.
Domain data models vs. IEPDs
The NIEM Lead Developer (GTRI) can and will help you
NIEM Data Component Names and Definitions
A scalable vocabulary that will be used by many different communities to exchange information must be understandable to all parties involved. To facilitate consistency and understanding, NIEM has established rules for naming and defining its data components. These rules apply to all types, elements, and attributes. They were derived from ISO/IEC Standard 11179, Information Technology – Metadata Registries (MDR). This standard has been around since the 1990s and continues to be updated. For this reason, please do not expect that NIEM rules for names and definitions are exactly synchronous with Standard 11179. Yet, for the most part, the NIEM Naming and Design Rules (NDR) still generally follow 11179 rules and guidance for designing metadata definitions and names.
How to Draft Data Component Definitions
Each NIEM element, attribute, and type must be clearly defined before it will be accepted for a NIEM domain update or release. ISO/IEC Standard 11179 Part 4 is the guidance upon which NIEM definitions are formulated. The salient points of that guidance and the NIEM NDR rules are repeated here:
Each data component definition must be unique from all others and distinguishable in meaning. No two definitions can be identical in wording or so close in meaning that they could refer to the same data component.
Try to keep definitions simple and straightforward. When this is not possible, at least make them understandable to others who are not a part of your community of interest (i.e., domain).
Element definitions almost always begin with an indefinite article (i.e., “a” or “an”), never a definite article (i.e., “the”).
Since it is often the case that a type and an element of that type can be defined with identical or similar words (for example, Person and PersonType), it is a NIEM best practice to begin a type definition with the phrase “A data type for …” This ensures that the definition for the element and its associated type are easily distinguishable.
If you have trouble designing a good definition for a data component, refer to the current NIEM release for examples.
Avoid using the terms in the name of a data component to define it. That said, a good data component name may be self-defining. If there are no good synonyms to employ in the definition, and you must use one or more terms from the name, it is not an error.
Aside from the typing implications of opening phrases mentioned above, do not put data typing information in a data component definition. An example of a bad definition is: SocialSecurityID - “A 9-digit number with hyphens that identifies a person in the U.S.” The fact that this element is a “9-digit number with hyphens” should not be included in its definition. In most cases separators are meaningless and unnecessary visual aids for human readability. The correct way to define this identifier is to use the xs:pattern facet to restrict it to nine digits without hyphens. Note that non-alphanumeric characters may be used within identifiers if they are an integral part of the identifier itself (for example, passwords), and are NOT merely being used as visual separators.
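To make the separation between definition and data typing concrete, here is a minimal sketch of the nine-digit constraint expressed as a pattern rather than as words in a definition. It assumes a pattern of the form \d{9}; the function names are hypothetical, and a real schema would state the constraint in the XSD rather than in application code.

```python
import re

# Nine digits and nothing else; analogous to an xs:pattern value of \d{9}.
SSN_PATTERN = re.compile(r"\d{9}")


def is_valid_ssn_id(value):
    """True only when the whole value is exactly nine digits."""
    return SSN_PATTERN.fullmatch(value) is not None


def strip_separators(value):
    """Remove hyphens and whitespace used purely as visual separators."""
    return re.sub(r"[-\s]", "", value)
```

A legacy value such as 123-45-6789 would fail validation as written, and pass once the visual separators are stripped.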
Based on the foregoing, it is good practice to avoid use of the word “type” within definitions, because in most cases, “type” refers to data typing. Instead, in the appropriate cases, a definition should use terms such as kinds, class, category, nature, genre, or form to refer to classifications (another relatively common word-sense of “type”).
Code list type definitions: A code list must have a definition for both its associated CodeType and CodeSimpleType. Both of these data types can have the same definition (one of the few exceptions to the unique definition rule) since they are semantically the same type. The difference is that the complex type extends the simple type to add several common properties that are part of the NIEM infrastructure. The definition should NOT refer to the code values or the code literals. For example, the definition for DayOfWeekCodeSimpleType could be “A code for a day in a week”. It should NOT include Su=Sunday, Mo=Monday, etc. These values and associated literals will be recorded in the xs:enumeration elements within the XSD for the CodeSimpleType.
Type definitions should describe what a type is, not list and define its contents. Describe it as an object, not a container of attributes. For example, VehicleType:
VehicleType (bad definition): A data type that contains the following (properties or characteristics) VehicleColorInteriorText, VehicleDoorQuantity, VehicleIdentification, VehicleMake, VehicleModel, …
VehicleType (good definition): A data type for a means of ground transportation designed to carry an operator, passengers, and/or cargo.
Neoskizzle:
Neoskizzle container containing many references to elements as defined in NeoskizzleType.
Neoskizzle contains identification information as well as many other characteristics. See type definition.
Neoskizzle containing necessary data elements.
Neoskizzle related information.
At this point, do you have any idea what “neoskizzle” means or is by reading any one of the definitions above? Of course not. So, here is an example of a good definition: A person who takes part in an event, activity, meeting, or other social function. Apparently, a synonym for “neoskizzle” is “participant”. So, why not use the term “Participant” for the element name? (By the way, in case you hadn’t guessed, there is no such word as “neoskizzle”. It’s made up.)
How to Design Data Component Names
The NIEM NDR provides fairly clear rules and guidance regarding the naming of data components. The most important NDR sections for understanding NIEM data naming are:
Summary of naming syntax
The syntax of NIEM data names comes from ISO/IEC 11179 Part 5.
A single NIEM data name may consist of a number of terms. A term is a meaningful word, an abbreviation for a word, or an acronym. Word terms are one of:
ID, the authorized abbreviation for Identifier.
URI, the authorized abbreviation for Uniform Resource Identifier.
In accordance with ISO/IEC 11179 Part 5, terms that make up a NIEM data name are classified into four basic parts according to their placement and function. A term in a data name can be one of these:
Example: VehicleTrafficControlDeviceCategoryCode: A data type for a kind of traffic control device (TCD) applicable to this motor vehicle at the crash location.
Vehicle = object term
Traffic, Control, Device = all qualifier terms modifying Category
Category = property term
Code = representation term
Other naming guidance
Avoid use of the term Type (except as a representation term to identify the name of a NIEM type). This is reserved for the representation term indicating data typing; instead use Category. Because it is an extremely common concept in all domains, the only current exception is BloodType.
Avoid use of the term Number in data names. Its use is usually too generic to be meaningful or helpful, unless a name is extremely common across all domains (not just a few). For example, the data name TelephoneNumber is common across all domains; in this case, it is most useful for clear meaning. Otherwise, a number should usually be an ID (Identifier), Quantity, Numeric (or Value), Amount, Measure, or Duration (of time) (See: https://reference.niem.gov/niem/specification/naming-and-design-rules/3.0/niem-ndr-3.0.html#section_10.8.7).
Do NOT use double terms (i.e., consecutive identical terms such as TypeType or NameName) unless such a term has very specific meaning. Double terms should be replaced with a single instance of the term (as long as it does not detract from the real meaning).
ID (Identifier) vs. Identification: ID and Identification elements are easy to confuse. An ID is a string element that uniquely identifies an entity; so, an ID has simple content. An Identification element is a set of sub-elements. For example, an Identification element for a person usually has sub-elements such as PersonName, PersonHeight, PersonWeight, PersonEyeColor, PersonHairColor, IssueDate, ExpirationDate, etc.; so, an Identification element has complex content (i.e., sub-elements). Usually one or more of its sub-elements will be ID elements.
Text(Type) vs. Name(Type): In the construction of NIEM element names, Name and Text are authorized representation terms (of type NameType and TextType respectively). The term Name is a word or phrase that constitutes the distinctive designation of and applies to a specific person, place, thing or concept. This is not necessarily an identifier; for example, there are multiple persons with the name “Bob”. Text is a word or phrase in some language (usually English).
Date(Type): The only date and time format supported by W3C XML Schema is a subset of ISO 8601. NIEM uses this for date and time.
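Legacy sources often store dates in other layouts, so a conversion step is usually needed before data can populate a NIEM date element. The following is a minimal sketch of that step; the legacy layout (month/day/year) and the function name are assumptions chosen for illustration.

```python
from datetime import datetime


def to_iso_date(legacy_value, legacy_format="%m/%d/%Y"):
    """Convert a legacy date string to the ISO 8601 calendar date form YYYY-MM-DD."""
    return datetime.strptime(legacy_value, legacy_format).date().isoformat()
```

For instance, under the assumed layout, the legacy value 09/29/2016 becomes 2016-09-29.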
Indicator(Type): This term designates NIEM Booleans whose valid values are TRUE or FALSE. Do NOT change these values to YES/NO or 1/0. If required, translation to and from TRUE or FALSE is not difficult.
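The translation mentioned above can be as small as a lookup. The sketch below assumes a legacy system that spells Booleans as YES/NO, Y/N, or T/F (hypothetical spellings) and normalizes them to the lower-case true and false lexical forms of xs:boolean; XML Schema also accepts 1 and 0, which are mapped here to the same two outputs.

```python
# Lexical forms that xs:boolean itself accepts.
XS_TRUE = {"true", "1"}
XS_FALSE = {"false", "0"}

# Hypothetical legacy spellings; adjust to the actual source system.
LEGACY_TRUE = {"yes", "y", "t"}
LEGACY_FALSE = {"no", "n", "f"}


def to_indicator(legacy_value):
    """Normalize a legacy Boolean spelling to 'true' or 'false'."""
    token = legacy_value.strip().lower()
    if token in XS_TRUE | LEGACY_TRUE:
        return "true"
    if token in XS_FALSE | LEGACY_FALSE:
        return "false"
    raise ValueError(f"not a recognized Boolean value: {legacy_value!r}")
```

Raising an error on unknown spellings, rather than guessing, keeps bad source data from silently becoming a valid indicator.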
Best examples of data names are in the current NIEM releases, in particular NIEM Core niem-core.xsd v3.0, or the Core from the most current major release.
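When reviewing candidate names against the naming syntax summarized earlier, it can help to split a name into its terms mechanically. The sketch below does this for camel-case names and then labels the terms by position. The positional labeling is a naive assumption (first term is the object term, last is the representation term, the term before it is the property term, everything in between is a qualifier); real names can have multi-word object or property terms, so treat the output as a starting point for review, not a ruling.

```python
import re

# An all-caps run (e.g., ID, URI) or a capitalized word.
_TERM = re.compile(r"[A-Z]+(?![a-z])|[A-Z][a-z0-9]*")


def split_terms(name):
    """Split an upper-camel-case data name into its terms."""
    return _TERM.findall(name)


def classify_terms(name):
    """Label terms by position only (a heuristic, see the note above)."""
    terms = split_terms(name)
    if len(terms) < 3:
        raise ValueError("need at least object, property, and representation terms")
    return {
        "object": terms[0],
        "qualifiers": terms[1:-2],
        "property": terms[-2],
        "representation": terms[-1],
    }
```

Applied to VehicleTrafficControlDeviceCategoryCode, the heuristic reproduces the breakdown given in the naming example: Vehicle as the object term, Traffic, Control, and Device as qualifiers, Category as the property term, and Code as the representation term.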
How to search for NIEM data components:
Although this section does not address NIEM modeling specifically, given the large number of NIEM data components, most modeling efforts will require searching the NIEM model for content to reuse, duplication, examples of content models, points of extension, etc. This section introduces some basic methods of searching for NIEM content.
The Schema Subset Generation Tool (SSGT) is one way to search for NIEM data components. This tool was designed to build and generate NIEM schema subsets. However, SSGT also contains a convenient search engine that can be used alone.
Basic Search
SSGT provides a full text search capability for a NIEM release. You can enter a search string into the criteria box and view a list of all NIEM data components that contain a lexical match to that string. The default is to search all data component text records in a single release, including names, definitions, and other metadata text fields. Namespaces are not part of the standard search. This is because namespace prefixes are extremely prevalent throughout the model and would tend to clutter most search results. As will be described later, the user can configure SSGT to search in particular ways.
Search is case independent. A search for sonty will return the data component PersonType.
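The behavior described here amounts to a case-insensitive substring match. The sketch below imitates it over a small list of names; it is not SSGT's implementation, and the function name is hypothetical.

```python
def search_components(names, criteria):
    """Return names containing the criteria as a substring, ignoring case."""
    needle = criteria.lower()
    return [name for name in names if needle in name.lower()]
```

With the list nc:PersonType and nc:LocationType, the criteria sonty selects only nc:PersonType, because the match can start in the middle of one term and end in the middle of the next.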
SSGT can only search a single release at a time (a technique for searching multiple releases at once is discussed later). The default is to search the most recent (current) operational release. If you need to search a previous NIEM release, you must select it from the “options” menu.
SSGT always requires that you declare the class of the data component you are looking for, and only that class will be in the scope of the search. This means you must declare one of the following in the dropdown box:
Search results are displayed in a list of NIEM data component qualified names, i.e., QName syntax: namespace prefix, followed by colon character, followed by component name; for example, nc:PersonFullName.
Depending on the outcome of a simple (default) search, a resulting list will have as many as three parts in this order:
Each part of the search result (if it exists) is sorted alphabetically by QName, which includes the namespace prefix so that component names are conveniently grouped by namespace.
SSGT does not indicate the two points (if either exists) at which these parts are joined. The only way to identify these break points is by visually inspecting the result list to find the locations where alphabetic ordering of names restarts.
Each data name is hyperlinked to its associated metadata and relationships. For an element this includes its definition, keywords (synonyms, if any), usage information (if any), usage examples (if any), its own type, and the types that use this element. For a type this includes its definition, content style, elements it contains, elements that are of this type, and base types from which it is derived. For convenience there are many features that help with identification of data components in the display. Property names (XML element and attribute names) are displayed in blue, and XML type names are displayed in red. XML element and attribute names are usually displayed with their associated XML type names. Hovering over a data component name will reveal its definition. A “details” link to the right of a property name and associated type name will reveal metadata about that property.
Note for MS Internet Explorer 11 users: This browser must be reconfigured to “compatibility mode” (not its default) for SSGT to work correctly:
Advanced Search
In general, the SSGT advanced search capability provides additional constraints that limit the scope of a search.
Click Show Advanced to open the SSGT advanced search capabilities. Here you can constrain the search criteria:
or any combination of these constraints.
You can also constrain search to particular metadata fields that you select. These are Names, Definitions, Keywords (synonyms), Usage (descriptions of how a data component is commonly used), and Example Content (samples of valid values for the element).
You can limit a search to one or more domains by checking/unchecking the appropriate boxes. If you want to search content outside of all domains (for example, NIEM Core and all code lists), then just check “other”. If you uncheck all boxes the tool will go back to searching the entire model (same as if all boxes are checked).
Multiple search criteria
You can enter multiple search terms separated by spaces in the search criteria box. In this case, SSGT search will return data component records that contain ALL those terms in any order.
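The all-terms, any-order rule can be sketched as follows. This is an imitation for illustration, not SSGT's implementation; the record layout (a name mapped to its definition text) and the sample definitions are assumptions.

```python
def matches_all_terms(record_text, criteria):
    """True when every space-separated term occurs somewhere in the text."""
    haystack = record_text.lower()
    return all(term in haystack for term in criteria.lower().split())


def search_records(records, criteria):
    """records maps a component name to its other searchable text."""
    return [
        name
        for name, text in records.items()
        if matches_all_terms(name + " " + text, criteria)
    ]


# Hypothetical records; the definitions are placeholders, not NIEM text.
SAMPLE_RECORDS = {
    "nc:PersonFullName": "A complete name of a person.",
    "nc:LocationName": "A name of a location.",
}
```

Because each term is tested independently, the criteria name person and person name select the same records.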
Wildcard search
SSGT also has one simple wildcard character, an asterisk. It can be used as many times as desired within the search criteria box. For example, the following search criteria:
son*nor*sim*yp
will return:
aamva_d20:PersonOrganDonorCodeSimpleType
because these criteria match this component name. A single asterisk in the search criteria box will return all data components in the release. (NOTE: After a wildcard search SSGT replaces each asterisk in the criteria with a percent character.)
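One way to reason about asterisk criteria is to translate them into a regular expression in which each asterisk stands for any run of characters. The sketch below does that; it is an approximation for experimenting with names offline, not a description of how SSGT evaluates wildcards internally.

```python
import re


def wildcard_to_regex(criteria):
    """Translate asterisk wildcards into a case-insensitive regular expression."""
    literal_parts = [re.escape(part) for part in criteria.split("*")]
    return re.compile(".*".join(literal_parts), re.IGNORECASE)


def wildcard_search(names, criteria):
    """Return names in which the criteria match anywhere."""
    pattern = wildcard_to_regex(criteria)
    return [name for name in names if pattern.search(name)]
```

Under this translation, son*nor*sim*yp finds son in Person, nor in Donor, sim in Simple, and yp in Type, in that order, and a lone asterisk matches every name.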
Movement enables you to search and explore the content of the NIEM model. It provides a user friendly interface, and smart search results. It is also open source: the community who inspired its creation can contribute to it.
See more information about Movement and visit the tool itself.
For anyone relatively new to NIEM, the Model Spreadsheet is an easier way to search, navigate, and explore the model, and to become familiar with NIEM content and structure relatively quickly.
Each NIEM release package contains a model spreadsheet that opens with Microsoft Excel. If you download and unzip a NIEM release package at https://release.niem.gov/ the spreadsheet will be niem-m.n.xlsx where m is the major (series) version integer, and n is the minor version integer.
The model spreadsheet is organized into the following tabs:
Each tab contains a list that also includes various metadata about the data components listed. Each type is hyperlinked to the location of its definition. The model spreadsheet is designed to explore types and their associated properties (sub-elements).
Excel menu Find and Search functions also take wildcards. Asterisk and question mark characters are used for multi- and single-character wildcards respectively. To find asterisks and question marks in the text, you must escape these characters with a tilde (~). So, ~* and ~? will find asterisks and question marks. To escape the tilde character itself, enter ~~.
Excel offers other kinds of wildcards and features that provide search capability similar to regular expressions. PC users might look at https://support.office.com/en-us/article/Find-and-replace-text-by-using-regular-expressions-Advanced-eeaa03b0-e9f3-4921-b1e8-85b0ad1c427f#__toc286834987. It would be nice if both PC and Mac Excel had identical user interfaces, but that is not the case. So, Mac users should check out http://searchengineland.com/advanced-filters-excels-amazing-alternative-to-regex-143680 (and beware, shortcuts for Mac Excel 2011 don’t always seem to work as expected).
Offline
Because NIEM is XML-based, most methods of searching NIEM are text oriented. Almost any good XML tool can be useful. Common XML editors that can do this are Altova XMLSpy and oXygen.
Other methods of searching NIEM use full text search capabilities that can quickly scan through large file sets. NIEM schemas contain an entire textual data dictionary. Multiple NIEM releases can be searched simultaneously using some version of grep (originally a Unix utility and available natively on an Apple Macintosh; there are also versions that run under Microsoft Windows). Grep is simple, fast, and facilitates user customization. By downloading all the NIEM release packages and placing their XML schemas into a single directory, grep can be customized to find and extract any text that can be described by a grep expression.
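Where grep is not available, the same multi-release search can be approximated with a few lines of Python. The sketch below assumes the release packages have been unzipped under one root directory (the layout is up to you) and returns every matching line with its file and line number; the function name and defaults are hypothetical.

```python
import re
from pathlib import Path


def grep_schemas(root, pattern, glob="*.xsd"):
    """Search every schema file under root for a regular expression, ignoring case.

    Returns a list of (file path, line number, stripped line) tuples.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(Path(root).rglob(glob)):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((str(path), line_number, line.strip()))
    return hits
```

On a system with grep, a comparable search from the same root directory is grep -rin "PersonType" . which recurses, ignores case, and prints line numbers.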
Online
Another method of searching NIEM schemas is through a Google site search. All NIEM component names and definitions are online in open text, both in their XSD form as well as XHTML. For example, https://release.niem.gov/niem/3.2/ is a Web page that indexes all the NIEM 3.2 schema documents. Furthermore, at https://release.niem.gov/ are all the NIEM release products since 2006. Each release is documented in HTML pages rooted at these links:
You can browse these pages that contain the contents of each NIEM release organized by namespace. The data components in each namespace are listed alphabetically by name, and each name is linked to a page that displays its definition and other metadata. Each namespace page also contains a direct link to its associated schema document.
You can also search both the release schemas and HTML documentation pages with Google. To do this, run a Google site search on data component names and definitions in the release schemas (XSD) or the namespace documentation pages (XHTML).
For example, in your browser address field, enter:
site:release.niem.gov/niem/ PersonType
Google search results will contain links to any Web page within the release.niem.gov/niem/ directory that contains a match to PersonType.
You may enter multiple search terms. For example, enter:
site:release.niem.gov/niem/ PersonType location
Search results will return links to pages that contain both PersonType and location in any order.
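If you run such searches often, the query can be assembled into a link. The sketch below is one possible helper; the function name is hypothetical, and it assumes the familiar https://www.google.com/search?q= URL form.

```python
from urllib.parse import urlencode


def niem_site_search_url(*terms, site="release.niem.gov/niem/"):
    """Build a Google site-search URL for the given search terms."""
    query = f"site:{site} " + " ".join(terms)
    return "https://www.google.com/search?" + urlencode({"q": query})
```

Calling it with PersonType and location produces a link equivalent to typing the two-term site search shown above.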
Character case is ignored.
Note that as an Internet search engine, Google generally only indexes full words (tokens surrounded by space or delimiters).
Google also uses “stemming” (recommend you Google that) and other heuristics to find pages with words related to search terms.
So, while Google can NOT find substrings within words (i.e., at the lexical level) the way SSGT can (such as sonty in the earlier example), Google can find partial matches to sets of words in textual paragraphs, sentences, and phrases.
For example, if you enter:
site:release.niem.gov/niem/ Pers
Google search returns pages that contain the string pers bound by spaces or other delimiters/separators (i.e., pers as a word), but it does not return pages that contain Person or PersonType because it only indexes words (which are bound by delimiter or separator characters, such as spaces, quotes, and other special characters).
Despite this limitation, Google has a number of powerful operators that can be applied to perform advanced site searches on NIEM schemas and documentation pages. See Google Advanced Search Operators.
This section describes how to map a database, model, dictionary, etc. to NIEM.
Note that this section is NOT meant to be an in-depth treatise on mapping. It is to provide an overview, some guidance, and suggestions for approaching this task. Mapping is usually hard and can be tedious, but it’s also necessary for making a legacy system work with NIEM.
Purpose of mapping
NIEM adopters generally do not start without pre-existing data requirements. They often begin with existing legacy metadata associated with a database or data model. This metadata usually contains most or all of their data requirements. NIEM is designed for sharing information with organizations and communities outside their sphere of influence. So, the primary targets for mapping to NIEM are those data requirements that will be shared externally. That said, a NIEM adopter may choose to map all of their metadata to NIEM; however, this is not necessary.
Mapping is an important step to NIEM adoption because it:
In computing and data management, data mapping is the process of identifying/recording semantic similarity between data elements of different data models or schemas. Data mapping is used as a first step for a wide variety of data integration tasks that often include data transformation or data mediation between a data source and a destination.
A mapping from a source (model, schema, database, etc.) to NIEM will have three functionally useful parts:
Although heuristics can be applied, few, if any, tools exist that automatically, accurately, and completely map between data models. There are far too many ways to represent and define data, and everyone has a different perspective. A few tools exist that provide some degree of intelligent (often heuristic-based) help by “suggesting” possible matches. However, mapping between different data models, databases, or data schemas will still require tedious human review. Most tools that support mapping help with the organizational and recording aspects. A spreadsheet is one of the simplest tools for documenting a map between a source (database, model, etc.) and NIEM. The basic concept of mapping is to match (and align) appropriate metadata components from the source to semantically equivalent or nearly equivalent components in NIEM.
Referring to https://en.wikipedia.org/wiki/Schema_matching, two terms are used to describe data mapping and are often used interchangeably: schema matching and mapping.
This NIEM guidance will use the term mapping to refer to both, because both schema matching and mapping can be combined into a single document.
Given a spreadsheet paradigm, a row identifies a semantic association (i.e., a match) between a source component and a NIEM component, and describes the nature of that association.
There are generally three possible outcomes for an association: exact, partial, or none. An exact match means that the components are semantically equivalent (or close enough). At the other extreme, no match (none) means that there are no NIEM components that can match or even partially match the source component. In this case, there will be no NIEM component identified with this record (until one is designed). This indicates a gap in the mapping for which a new NIEM component is required; that component will be designed (and proposed) later as part of the NIEM adoption and on-boarding process for new domains. And of course, somewhere between an exact match and no match lies a partial match.
Various types of information may be used to evaluate a match, including (but not limited to):
At present, a small number of NIEM components also include synonym, usage, and sample value information that may be helpful. This information can be found within SSGT (when it exists). The NBAC and Domains are always welcomed and encouraged to suggest additions to this kind of information.
Thus, for a given component of the source model, a matching generally requires three basic parts:
Source model component | NIEM component | Description of mapping |
---|---|---|
Name | Name | Nature of semantic match (one of): |
Definition | Definition | - exact match |
Datatype | Datatype | - partial match |
etc. | etc. | - none (missing in NIEM) |
For convenience and completeness mappings usually incorporate any additional metadata associated with each source and NIEM component (such as definition, datatype, cardinality, etc.). Also, a column for general comments/notes (not shown above) is incorporated into most mapping documents to ensure clarity and to record rationales for difficult associations.
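The three-part record described above can be sketched in code. This is a minimal illustration, not a NIEM-defined template: the class name `MappingRow`, the field names, the source element `BadgeColor`, and the placeholder definitions are all assumptions made for the example.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from enum import Enum


class MatchKind(Enum):
    """The three possible outcomes for an association."""
    EXACT = "exact"
    PARTIAL = "partial"
    NONE = "none"  # missing in NIEM


@dataclass
class MappingRow:
    """One spreadsheet row: source component, NIEM component, nature of match."""
    source_name: str
    source_definition: str
    source_datatype: str
    niem_name: str        # left empty when no NIEM component matches
    niem_definition: str
    niem_datatype: str
    match: MatchKind
    notes: str = ""       # rationale for difficult associations


def to_csv(rows):
    """Serialize mapping rows to CSV text, one line per association."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(MappingRow)])
    writer.writeheader()
    for row in rows:
        record = asdict(row)
        record["match"] = row.match.value  # store the readable label, not the Enum
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical sample data; definitions are placeholders, not NIEM text.
rows = [
    MappingRow("Name", "Full name of a person", "string",
               "PersonName", "(definition from NIEM)", "PersonNameType",
               MatchKind.EXACT, "Source concatenates first, middle, last"),
    MappingRow("BadgeColor", "Color of an access badge", "string",
               "", "", "", MatchKind.NONE, "Gap: candidate for a new component"),
]
mapping_csv = to_csv(rows)
```

The resulting CSV text can be opened in any spreadsheet tool, which keeps the map reviewable by people who do not work with code.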
While many strategies may exist for performing mapping between models and/or schemas, most of these can be classified into one of two general approaches: top down and bottom up.
Our own experience with mapping to NIEM indicates that a top down approach works best. This is probably because NIEM tends to employ a top down structure for data representations. NIEM is object-oriented and therefore models real world entities and concepts as collections of properties. The objects/concepts are usually complex, and often their properties are also complex. For example, a Person is represented by a complexType, PersonType. PersonType has properties such as PersonBirthDate, PersonHairColorCode, PersonEyeColorCode, and PersonName. While the first three properties mentioned are simpleTypes, PersonName is of type PersonNameType, which is a complexType. PersonNameType breaks down into several simpleType elements representing the various parts of a person’s name (such as first, middle, last, etc.).
Using a top down approach, identify the entities represented in your own model (or schemas) and match them to the same objects/concepts in NIEM (if they exist). There will likely be differences in the properties contained in each pairing. However, by starting with the top level entities you will likely narrow your search for matching properties at the atomic level. Map objects and their properties based on semantics not datatype.
Even if semantics of two properties are identical, datatypes will not always be the same. For example, you may determine that the source model represents a person’s name as a simple string, Name, that concatenates first, middle, and last names. NIEM represents a person’s name as a complex element, PersonName, consisting of three separate string elements: PersonGivenName (first), PersonMiddleName, and PersonSurName (last). Each of these is only a partial match to the source element for a person’s name. However, in this example, Name is an exact match to PersonName, and the transformation from one to the other (or reverse) is a simple set of operations that involve either concatenation or parsing depending on the direction of the transform.
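The two directions of that transform can be sketched as follows. The function names `join_name` and `split_name` are hypothetical, and the parser assumes whitespace-separated names in first, middle, last order; real-world names often violate that assumption, so treat this as an illustration of the idea rather than a production parser.

```python
def join_name(given, middle, sur):
    """Concatenate the PersonName parts into a single Name string.

    Empty parts are skipped so no double spaces appear.
    """
    return " ".join(part for part in (given, middle, sur) if part)


def split_name(name):
    """Naively parse a Name string into (given, middle, sur).

    Assumption: the first token is the given name, the last token is the
    surname, and everything in between is the middle name.
    """
    tokens = name.split()
    if not tokens:
        return ("", "", "")
    if len(tokens) == 1:
        return (tokens[0], "", "")
    return (tokens[0], " ".join(tokens[1:-1]), tokens[-1])
```

Concatenation is lossless in practice, while parsing is a best guess. That asymmetry is worth recording in the notes column of the mapping.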
If an object or concept in your model does not have an equivalent or similar entity in NIEM, flag it as a potential shortfall. Later you will build this object into NIEM as part of your new domain model (to be discussed later). Likewise, when comparing atomic level object/concept properties you should also flag properties in your model that are not represented in NIEM.
Exact matches and no matches are fairly easy to handle. An “exact match” reuses a NIEM component as-is, and a “no match” generates a new NIEM component.
However, partial semantic matches tend to be the norm. Sometimes they can be resolved by breaking down complex types into constituent components and subsequently mapping to them. However, most schemes and models for data representation have many dimensions that can complicate what would otherwise appear to be an exact or near exact match, including names, semantics, structure, constraints, datatype, usage, and others. For this reason, sometimes treating a partial match as “no match” (and subsequently designing new NIEM components) may be more practical and productive.
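As a rough sketch of how these outcomes drive follow-up work, a completed mapping can be grouped into a worklist. The action labels, the `triage` function, and the sample source names are assumptions for illustration only; they are not part of any NIEM tool.

```python
from collections import defaultdict

# Follow-up action for each match outcome, paraphrasing the guidance above.
ACTIONS = {
    "exact": "reuse NIEM component as-is",
    "none": "design new NIEM component",
    "partial": "review: decompose, or treat as no match",
}


def triage(mappings):
    """Group source component names by the follow-up action they need.

    `mappings` is an iterable of (source_name, match) pairs, where match
    is one of "exact", "partial", or "none".
    """
    worklist = defaultdict(list)
    for source_name, match in mappings:
        worklist[ACTIONS[match]].append(source_name)
    return dict(worklist)


# Hypothetical source components.
worklist = triage([
    ("Name", "exact"),
    ("BadgeColor", "none"),
    ("HomeAddress", "partial"),
    ("BirthDate", "exact"),
])
```

Partial matches land in a review bucket instead of being resolved automatically, because deciding between decomposition and a new component requires human judgment.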
[ More to follow, including examples, Change Request, Component Mapping Template ]
Considerations in identifying, designing, defining, and structuring data components
Questions to ask or consider about metadata, semantics, and structure while modeling NIEM content. These can help you decide what to model, how, and what NIEM structures to employ or consider.
For each potential NIEM Core data component (type/property):
If you are a domain representative, an NBAC member, or an NTAC member you may want to register for one or more of the following resources in order to participate more fully in NIEM governance and its associated processes.
If you are a domain representative developing NIEM content or maintaining your domain namespace, or if you are an NBAC or NTAC member, then please talk directly to the lead developer for technical assistance. This is your privilege as a member of NIEM governance. The lead developer (GTRI) participates in all NIEM governance meetings, so please don’t hesitate to ask the lead developer for technical help. They will be happy to schedule a telephone appointment to talk with you. Asking questions and making suggestions or recommendations through these resources is intended more for your constituents. You can use these resources too, but for technical assistance it is usually faster to contact the lead developer directly.
URLs to releases, technical references, tools, templates, and other aids
Releases | URL | Notes |
---|---|---|
All NIEM release packages | https://release.niem.gov/ | |
NIEM domain updates, core supplements | https://publication.niem.gov/ | |
NIEM 5.0 release | https://release.niem.gov/niem/5.0/ | |
Reference Tools | URL/Email | Notes |
---|---|---|
NIEM online reference tools | https://tools.niem.gov/ | |
Schema Subset Generation Tool (SSGT) | https://tools.niem.gov/niemtools/ssgt/index.iepd | Search, XML subsets |
Conformance Testing Assistant (ConTesA) | https://tools.niem.gov/contesa/ | |
Code List Schema Generator | https://tools.niem.gov/niemtools/codelist/index.iepd | Legacy; for NIEM 2.0 |
NIEM GitHub | https://niem.github.io/ | |
NIEM Releases Issue Tracker | https://github.com/niem/niem-releases/issues | For content-related issues |
Other Tools | URL/Email | Notes |
---|---|---|
Wayfarer (online; by Tom Carlson) | http://niemwayfarer.org/ | search NIEM |
Wayfarer (for download; by Tom Carlson) | http://niemwayfarer.org/tools/niemsaw/ | search NIEM |
Oracle Content Assembly Mechanism (CAM) | https://sourceforge.net/projects/camprocessor/ | XML editor, validator, designer |
MITRE Openii | http://openii.sourceforge.net/ | information integration tool suite |
Altova XMLSpy | http://www.altova.com/xmlspy.html | XML editor (not free) |
Altova NIEM solutions | http://www.altova.com/solutions/niem.html | other Altova support for NIEM |
Oxygen XML Editor | https://www.oxygenxml.com/ | XML editor (not free) |
Other Resources | URL | Notes |
---|---|---|
NIEM home | https://www.niem.gov/ | |
Change request | https://reference.niem.gov/niem/resource/change-request/ | |
IEPD Clearinghouse | https://www.it.ojp.gov/niss/ | mostly older IEPDs |
ISO/IEC Standard 11179 Information Technology – Metadata Registries (MDR) | http://metadata-standards.org/11179/ | guidance for names and definitions |
This appendix defines terminology associated with NIEM. The first section defines common terms that are not formally defined in NIEM specifications. Subsequent sections reference the definitions for terms used in the NIEM specifications. Each list is sorted alphabetically.
Common NIEM terms
nc:PersonSSNIdentification consists of three terms: Person, SSN (an acronym for Social Security Number), and Identification.