Subset Schemas

A subset schema is a customized version of a NIEM schema that contains only the properties, types, and codes that are needed for a particular information exchange, plus any of their required dependencies.

The 4.0 NIEM release includes 11,000 NIEM elements and over 150 files (the external standards, including GML, contribute to the file count). While it is possible to use a full NIEM release package in an IEPD, an IEPD developer typically selects only the release components that the exchange needs and generates a much smaller subset of the release.

A subset schema must still conform to the NDR and cannot allow any content that is not permitted by the full corresponding NIEM schema.

The collection of individual subset schemas for a release is typically referred to as a NIEM subset. A subset will likely be much smaller than the corresponding full release.

Size

The size of a NIEM subset will vary based on how much of the NIEM model is being reused, but a subset can easily be narrowed down to a couple dozen or a couple hundred components across a dozen files. The reduced size and scope of a subset should improve validation and tool performance. It also makes the IEPD much easier to understand for other users.

Subsets and IEPDs

IEPDs are self-contained packages, meaning that each one should contain all the files and information needed for implementation. To reuse components from a NIEM release, the IEPD must contain either the full set of release schemas or a subset of that release, so that the source of those NIEM components is part of the package.

In an IEPD package, NIEM XML schemas are typically included in the base-xsd folder. The NIEM release or NIEM subset would be included in the niem folder.

# Partial IEPD package directory

iepd-name/
  base-xsd/
    niem/
      codes/
      domains/
      niem-core/
      utility/
    extension/

Extension schemas, which contain user-created properties and types that represent requirements not found in NIEM, go in the extension folder. These extension schemas import and reuse NIEM components from the niem folder.

If you are developing multiple related IEPDs, you can either build a custom subset for each IEPD or reuse a single combined subset that covers the requirements of all of the IEPDs.

Generating a Subset

Creating a subset manually can be time-consuming and error-prone. The Schema Subset Generation Tool (SSGT) can be used to search the NIEM data model, select components for a subset, calculate dependencies, and generate subset XML schemas.
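To make the "calculate dependencies" step concrete, here is a minimal sketch of a transitive dependency walk. It is not how the SSGT is implemented. The DEPENDS_ON map and the with_dependencies function are hypothetical, and the edges are a hand-written, simplified illustration rather than data taken from the release.

```python
from collections import deque

# Hypothetical, simplified dependency map (element -> its type, type -> its base type).
# Written by hand for illustration; not extracted from the NIEM release.
DEPENDS_ON = {
    "nc:PersonName": ["nc:PersonNameType"],
    "nc:PersonNameType": ["structures:ObjectType"],
    "nc:PersonGivenName": ["nc:PersonNameTextType"],
    "nc:PersonSurName": ["nc:PersonNameTextType"],
    "nc:PersonNameTextType": ["nc:ProperNameTextType"],
    "nc:ProperNameTextType": ["nc:TextType"],
    "nc:TextType": [],
    "structures:ObjectType": [],
}


def with_dependencies(selected, depends_on):
    """Return the selected components plus everything they transitively require."""
    needed = set()
    queue = deque(selected)
    while queue:
        component = queue.popleft()
        if component in needed:
            continue  # already handled; also guards against cycles
        needed.add(component)
        queue.extend(depends_on.get(component, []))
    return needed


subset = with_dependencies(["nc:PersonName", "nc:PersonGivenName"], DEPENDS_ON)
print(sorted(subset))
```

Selecting two elements pulls in their types and the base types of those types, which is why a generated subset usually contains more components than were explicitly chosen.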

Example subset of a NIEM type

The following shows the full PersonNameType from the 4.0 Core namespace:

  <xs:complexType name="PersonNameType">
    <xs:annotation>
      <xs:documentation>A data type for a combination of names and/or titles by which a person is known.</xs:documentation>
    </xs:annotation>
    <xs:complexContent>
      <xs:extension base="structures:ObjectType">
        <xs:sequence>
          <xs:element ref="nc:PersonNamePrefixText" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonGivenName" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonMiddleName" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonSurName" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonNameSuffixText" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonMaidenName" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonFullName" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonNameCategoryAbstract" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonNameSalutationText" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonOfficialGivenName" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonPreferredName" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonSurNamePrefixText" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonNameAugmentationPoint" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
        <xs:attribute ref="nc:personNameCommentText" use="optional"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

The following shows a subset of this type, generated by the SSGT, with only the given name, middle name, and surname included. The cardinality has also been restricted: the given name and surname must each appear exactly once.

  <xs:complexType name="PersonNameType">
    <xs:annotation>
      <xs:documentation>A data type for a combination of names and/or titles by which a person is known.</xs:documentation>
    </xs:annotation>
    <xs:complexContent>
      <xs:extension base="structures:ObjectType">
        <xs:sequence>
          <xs:element ref="nc:PersonGivenName" minOccurs="1" maxOccurs="1"/>
          <xs:element ref="nc:PersonMiddleName" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element ref="nc:PersonSurName" minOccurs="1" maxOccurs="1"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
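The difference between the two definitions can also be checked programmatically. The following is a rough sketch using Python's standard xml.etree.ElementTree module. The FULL and SUBSET strings are abbreviated copies of the definitions above (annotations and most elements omitted), and element_cardinalities is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def element_cardinalities(type_xml):
    """Map each referenced element to its (minOccurs, maxOccurs) attribute strings."""
    root = ET.fromstring(type_xml)
    return {
        el.get("ref"): (el.get("minOccurs", "1"), el.get("maxOccurs", "1"))
        for el in root.iter(XS + "element")
    }


# Abbreviated copy of the full type (only five of its elements are listed here).
FULL = """<xs:complexType name="PersonNameType" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:sequence>
    <xs:element ref="nc:PersonNamePrefixText" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element ref="nc:PersonGivenName" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element ref="nc:PersonMiddleName" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element ref="nc:PersonSurName" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element ref="nc:PersonNameSuffixText" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>"""

# Abbreviated copy of the subset type.
SUBSET = """<xs:complexType name="PersonNameType" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:sequence>
    <xs:element ref="nc:PersonGivenName" minOccurs="1" maxOccurs="1"/>
    <xs:element ref="nc:PersonMiddleName" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element ref="nc:PersonSurName" minOccurs="1" maxOccurs="1"/>
  </xs:sequence>
</xs:complexType>"""

full = element_cardinalities(FULL)
subset = element_cardinalities(SUBSET)

removed = sorted(set(full) - set(subset))   # dropped from the subset
unknown = sorted(set(subset) - set(full))   # must be empty for a valid subset
changed = {
    ref: (full[ref], subset[ref])
    for ref in subset
    if ref in full and full[ref] != subset[ref]
}

print("removed:", removed)
print("unknown:", unknown)
print("changed:", changed)
```

An empty unknown list confirms that the subset references nothing the full type does not already allow. The changed map shows which elements had their cardinality narrowed.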

Valid Subset Operations

The following describes a set of operations to consider when constructing subset schemas. It is possible to apply them in combinations that could break the subset relationship, or even result in invalid schemas. Apply these operations carefully and thoughtfully.

Limit cardinality

NIEM adopts an optional and over-inclusive strategy in order to support a broad user base with very different sets of needs. A subset for an IEPD represents a single set of exchange requirements and is the right place to tailor the cardinality as needed.

Cardinality can be adjusted as long as the new cardinality is permitted by the original release cardinality.

If an element in a type has cardinality (1, unbounded), then the following are examples of valid and invalid adjustments in a subset to the original cardinality.

These cardinalities fall within the original cardinality range:

  • (1, 1)
  • (2, 10)

These cardinalities fall outside the original cardinality range (0 was not permitted):

  • (0, 1)
  • (0, unbounded)

Element options:

  • Increase the value of an xs:element/@minOccurs as long as it remains less than or equal to the corresponding xs:element/@maxOccurs, both as defined in the original schema and as used in the subset.
  • Decrease the value of an xs:element/@maxOccurs as long as it remains greater than or equal to the corresponding xs:element/@minOccurs, both as defined in the original schema and as used in the subset.
  • Remove an optional xs:element (xs:element/@minOccurs="0") from its type.
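The cardinality rule amounts to a range-containment check, sketched below. The function name is_valid_restriction and the use of math.inf to stand in for maxOccurs="unbounded" are illustrative choices, not part of NIEM or any tool.

```python
import math

UNBOUNDED = math.inf  # stands in for maxOccurs="unbounded"


def is_valid_restriction(original, proposed):
    """True if the proposed (min, max) range lies entirely within the original range."""
    orig_min, orig_max = original
    new_min, new_max = proposed
    return orig_min <= new_min <= new_max <= orig_max


original = (1, UNBOUNDED)
for proposed in [(1, 1), (2, 10), (0, 1), (0, UNBOUNDED)]:
    print(proposed, is_valid_restriction(original, proposed))
```

Removing an element from its type is equivalent to proposing (0, 0), which passes this check only when the original minOccurs is 0.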

Attribute options:

  • Change an xs:attribute/@use="optional" to "required".
  • Change an xs:attribute/@use="optional" to "prohibited".
  • Remove the reference of an xs:attribute with @use="optional" from a type.
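The attribute options can be summarized as a small lookup table. This is only a sketch: the names are hypothetical, and the "required" row is an inference (a required attribute has to stay required, since instances without it would not be valid against the original schema) rather than something stated in the list above.

```python
# Permitted values for xs:attribute/@use in a subset, keyed by the original value.
# "removed" means the attribute reference is dropped from the type.
ALLOWED_USE_CHANGES = {
    "optional": {"optional", "required", "prohibited", "removed"},
    "required": {"required"},  # assumption: inferred, not stated in the source text
}


def is_valid_use_change(original_use, subset_use):
    """True if the subset's use of an attribute is permitted by the original use."""
    return subset_use in ALLOWED_USE_CHANGES.get(original_use, set())


print(is_valid_use_change("optional", "required"))
print(is_valid_use_change("required", "removed"))
```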

Remove components and items

Attributes, elements, and types can be removed from a subset if they are not being used by or required by other components in the subset.

  • Remove an unused xs:element declaration.
  • Remove an unused xs:complexType or xs:simpleType declaration.
  • Remove an unused element with representation term AugmentationPoint (these are specifically required for REF schemas in the NDR but can be removed from a subset).
  • Remove an unused xs:import statement.
  • Remove an unused file.
  • Remove a comment.
  • Remove an xs:annotation (definitions can be removed from a subset).
  • Remove an xs:enumeration from an xs:simpleType as long as it is not the only remaining xs:enumeration. (Removing the last enumeration would change the type to free text instead of a list of codes, which would not be permitted by the original schema.)
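The enumeration rule can be expressed as a set check: the subset's codes must be a non-empty subset of the original codes. The sketch below uses made-up code values ("A", "B", "C") and a hypothetical function name.

```python
def is_valid_code_subset(original_codes, subset_codes):
    """True if the subset keeps at least one code and adds none of its own."""
    original = set(original_codes)
    subset = set(subset_codes)
    return bool(subset) and subset <= original


original_codes = ["A", "B", "C"]  # hypothetical code values

print(is_valid_code_subset(original_codes, ["A", "C"]))  # some codes removed
print(is_valid_code_subset(original_codes, []))          # last enumeration removed
print(is_valid_code_subset(original_codes, ["A", "D"]))  # new code added
```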

Other constraints

  • Add or apply a constraining facet to an xs:simpleType.
  • Change a concrete (non-abstract) xs:element declaration to xs:element/@abstract="true".
  • Change an xs:element/@nillable="true" to xs:element/@nillable="false".
  • Substitute one or more xs:element/@substitutionGroup members for its associated substitution group head.
  • Replace a wildcard with a composition, i.e., an ordered sequence of elements.