DHS DCAT v3 Extension Terms

As part of the DHS Data Inventory effort launched in the fall of 2021, a set of terms was developed to extend the DCAT v3 terms and provide greater detail and fidelity to the DHS Data Inventory Program, specifically the Data Inventory Record. The terms listed below are linked to their more detailed definitions (provided in the section below).

http://github.com/usdhs/dcat-tool/0.1#highLevel

Data Catalog Record Access Level  Component  Restriction Reason 

http://github.com/usdhs/dcat-tool/0.1#dataGovernanceInformation

Governance  Owner  Steward  Custodian  FISMA ID  Sharing Agreements  Collection Authority  Release Authority  Records Schedule  PTA Adjudicated Date 

http://github.com/usdhs/dcat-tool/0.1#standards

Conforms FIPS  Conforms NIEM Percent  Conforms Unicode  Identities NativeScript  Transliteration Standard 

http://github.com/usdhs/dcat-tool/0.1#provenance

Source Datasets  Destination Datasets  Is Open Source  Is Commercial  Has Data Dictionary 

http://github.com/usdhs/dcat-tool/0.1#dataQuality

Data Quality Known  Data Quality Percent  Data Quality Assessment 

http://github.com/usdhs/dcat-tool/0.1#distribution

Access URL NIEM  Vendor  Functional Data Domain  Is Stream  Access Instructions 

http://github.com/usdhs/dcat-tool/0.1#size

Table Count  Record Count 

http://github.com/usdhs/dcat-tool/0.1#security

Dataset Classification  Characteristics - Person Level  Characteristics - Financial  Characteristics - Event records  Characteristics - Faces  Characteristics - Fingerprints  Characteristics - CUI  Characteristics - PHI  Characteristics - PII  Characteristics - Geospatial  Characteristics - Environmental  Characteristics - FISA  Characteristics - 8usc1367  Characteristics - proprietary info  Characteristics - Immigration  Characteristics - Critical Infrastructure  Characteristics - PCII - Protected Critical Infrastructure Information  Characteristics - biometrics  Characteristics - Dissemination Restrictions  Characteristics - LES - Law Enforcement Sensitive  Characteristics - Synthetic  Characteristics - Anonymized 

http://github.com/usdhs/dcat-tool/0.1#location

Hosting Location  Hosted in Cloud  Easily Accessible By Creating Component  Easily Accessible By All Components  Easily Accessible By General Public 

http://github.com/usdhs/dcat-tool/0.1#publication

Record Transmission  Validity Time 

Detailed definitions follow:

Data Catalog Record Access Level

Term Name: dhs:dataCatalogRecordAccessLevel   return to top
Description

The classification of this Data Inventory Record as either Public or Non-Public.

Can this record be published in data.gov?

  • Yes = Public
  • No = Non-Public

Namespace http://github.com/usdhs/dcat-tool/
Required Yes
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Component

Term Name: dhs:component   return to top
Description

The DHS Component (sub-agency) the data pertains to.

  • MGMT
  • CISA
  • CBP
  • FEMA
  • S&T
  • TSA
  • ICE
  • USCIS
  • FPS
  • FLETC
  • USSS
  • USCG

Namespace http://github.com/usdhs/dcat-tool/
Required Yes
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Restriction Reason

Term Name: dhs:restrictionReason   return to top
Description

A reason for which the dataset is not published to data.gov (i.e., not marked as 'public' for accessLevel and not marked as 'public' for dataCatalogRecordAccessLevel). Must be provided as a comma-separated list including one or more of the following:

  • PII Sensitive
  • Security Risk
  • Legal Liability
  • Intellectual Property Rights
  • Confidential Business Information
  • Restricted by Contract
  • FOIA Exemption
  • OMB Director Discretion

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string
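
As a rough illustration of the comma-separated format described above, the following Python sketch splits a restrictionReason value and checks each entry against the listed reasons. The helper name parse_restriction_reasons is hypothetical and is not part of the dcat-tool; exact, case-sensitive matching is an assumption.

```python
# Allowed values copied from the list in the restrictionReason description.
ALLOWED_REASONS = (
    "PII Sensitive",
    "Security Risk",
    "Legal Liability",
    "Intellectual Property Rights",
    "Confidential Business Information",
    "Restricted by Contract",
    "FOIA Exemption",
    "OMB Director Discretion",
)


def parse_restriction_reasons(value: str) -> list[str]:
    """Split a comma-separated restrictionReason string and validate it.

    Hypothetical helper: exact, case-sensitive matching is an assumption.
    """
    # Drop surrounding whitespace and ignore empty fragments such as "a,,b".
    reasons = [part.strip() for part in value.split(",") if part.strip()]
    if not reasons:
        raise ValueError("at least one restriction reason is required")
    unknown = [reason for reason in reasons if reason not in ALLOWED_REASONS]
    if unknown:
        raise ValueError(f"unrecognized restriction reason(s): {unknown}")
    return reasons
```

For instance, parse_restriction_reasons("PII Sensitive, FOIA Exemption") would return the two reasons as a list, while a value not in the list above raises ValueError.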

Governance

Term Name: dhs:governance   return to top
Description

The name or email address (individual or group) of the party responsible for overseeing the information contained in the resource. Provide either a name or an email address when submitting via Excel, or a full vCard when submitting via JSON.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Owner

Term Name: dhs:owner   return to top
Description

The name or email address (individual or group) of the party responsible for the accuracy of the information contained in the resource. Provide either a name or an email address when submitting via Excel, or a full vCard when submitting via JSON.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Steward

Term Name: dhs:steward   return to top
Description

The name or email address (individual or group) of the administrator of the dataset who ensures the information is properly stored, maintained, accessible, and protected. If submitting via JSON, provide a full vCard.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Custodian

Term Name: dhs:custodian   return to top
Description

The name or email address (individual or group) of the party who has physical possession of the information. If submitting via JSON, provide a full vCard.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

FISMA ID

Term Name: dhs:fismaID   return to top
Description

The FISMA ID is a unique identifier that describes various key characteristics of a specific system.

If the dataset is part of a system that has a FISMA ID then the FISMA ID should be provided.

Example:

  • FSA-00100-MAJ-00100
  • - OR -
  • TSA-02379-SUB-00465

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string
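
The two example IDs above share a visible shape (three letters, five digits, three letters, five digits, joined by hyphens). The sketch below checks only that shape. This pattern is an assumption inferred from those two examples, not an official FISMA ID specification, and the function name is hypothetical; real identifiers may follow other layouts.

```python
import re

# Assumption: shape inferred solely from the two examples in this document.
FISMA_ID_SHAPE = re.compile(r"[A-Z]{3}-[0-9]{5}-[A-Z]{3}-[0-9]{5}")


def looks_like_fisma_id(value: str) -> bool:
    """Return True if the value matches the shape of the example FISMA IDs."""
    return FISMA_ID_SHAPE.fullmatch(value.strip()) is not None
```

A check like this can catch obvious typing mistakes before submission, but it cannot confirm that an ID actually exists.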

Sharing Agreements

Term Name: dhs:sharingAgreements   return to top
Description

An agreement that states what information can be shared or used, and how.

Example:

  • https://www.dhs.gov/sites/default/files/publications/privacy_crcl_guidance_ise_2009-01_0.pdf

DHS

  • https://catalog.data.gov/dataset/fingerprint

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Collection Authority

Term Name: dhs:collectionAuthority   return to top
Description

The legislation or executive order under which the data was collected.

The specific document or policy that grants you permission to collect specific data.

Example:

  • https://www.ecfr.gov/current/title-28/chapter-I/part-28/subpart-B/section-28.12

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Release Authority

Term Name: dhs:releaseAuthority   return to top
Description

The document, legislation, or executive order under which the data can be released.

Examples:

  • https://www.ecfr.gov/current/title-6/chapter-I/part-7/subpart-B/section-7.23
  • https://irp.fas.org/dni/icd/icd-403.pdf

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Records Schedule

Term Name: dhs:recordsSchedule   return to top
Description

The policy indicating the time period in which the records are retained.

Example:

  • DAA-0563-2013-0005 (link below)

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

PTA Adjudicated Date

Term Name: dhs:ptaAdjudicatedDate   return to top
Description

The date on which the Privacy Threshold Assessment (PTA) was adjudicated.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:date
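
Since the SHACL datatype here is xsd:date, values are expected in YYYY-MM-DD form. The following Python sketch shows one way a submitter might check a value before sending it. The function name is hypothetical, and the sketch assumes no timezone suffix is used (xsd:date permits an optional one, which this code rejects).

```python
import re
from datetime import date

# Plain YYYY-MM-DD only; an optional xsd:date timezone suffix is not handled.
_DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_pta_adjudicated_date(value: str) -> date:
    """Parse a YYYY-MM-DD string into a date, raising ValueError if invalid."""
    text = value.strip()
    if _DATE_SHAPE.fullmatch(text) is None:
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    # fromisoformat also rejects impossible calendar dates such as 2021-02-30.
    return date.fromisoformat(text)
```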

Conforms FIPS

Term Name: dhs:conformsFIPS   return to top
Description

The Federal Information Processing Standard (FIPS) that the dataset conforms to, if any.

Example:

  • FIPS201-3
  • https://csrc.nist.gov/publications/detail/fips/201/3/final

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Conforms NIEM Percent

Term Name: dhs:conformsNIEMPercent   return to top
Description

The numerical percentage (0-100) of the dataset that is NIEM compliant.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:integer

Conforms Unicode

Term Name: dhs:conformsUnicode   return to top
Description

True or False

Is the information in the resource written in Unicode (does the code given to each character start with the letter 'u')?

Information containing foreign languages or characters is typically written in Unicode format (U+XXXX).

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Identities NativeScript

Term Name: dhs:identitiesNativeScript   return to top
Description

True or False

Are the names of individuals and other entities stored in a way (e.g. Roman characters) that can be shared across systems?

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Transliteration Standard

Term Name: dhs:transliterationStandard   return to top
Description

The standard used to convert text from another language or script into Roman characters.

Example:

  • https://www.iso.org/ics/01.140.10/x/

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Source Datasets

Term Name: dhs:sourceDatasets   return to top
Description

An identifier for the originating dataset (the DHS Unique Identifier if possible or DOI or URL for externally developed datasets).

Example:

  • Mobius - creates TRM Data which TRM team uses for various projects.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Destination Datasets

Term Name: dhs:destinationDatasets   return to top
Description

The unique identifier of downstream datasets fed by this dataset.

Example:

  • In this instance, Mobius would include the unique identifier for downstream datasets.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Is Open Source

Term Name: dhs:isOpenSource   return to top
Description

True or False

The data in the dataset is composed of publicly available data and may be compiled from multiple sources.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Is Commercial

Term Name: dhs:isCommercial   return to top
Description

True or False

The dataset is acquired from a private sector provider (may be purchased or acquired via agreement).

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Has Data Dictionary

Term Name: dhs:hasDataDictionary   return to top
Description

True or False

The dataset has a data dictionary (should be pointed to in the referencedBy attribute).

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Data Quality Known

Term Name: dhs:dataQualityKnown   return to top
Description

True or False

Is the quality of the information in the resource known?

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Data Quality Percent

Term Name: dhs:dataQualityPercent   return to top
Description

A simple measure of the data quality, on a scale of 0-100.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:integer
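
A minimal Python sketch of the 0-100 integer constraint described above. The function name is hypothetical, and treating both endpoints as valid is an assumption; the same check would apply to any other percentage term that uses the 0-100 scale.

```python
def validate_percent(value: int) -> int:
    """Return value unchanged if it is an integer from 0 to 100 inclusive."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("percent must be an integer")
    if not 0 <= value <= 100:
        raise ValueError("percent must be between 0 and 100")
    return value
```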

Data Quality Assessment

Term Name: dhs:dataQualityAssessment   return to top
Description

A written account evaluating the quality of the information contained within the resource.

Example:

  • This dataset has some gaps in critical areas. Efforts are underway to remediate.
  • This dataset is based on information obtained in 2017. Efforts underway to obtain current data.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Access URL NIEM

Term Name: dhs:accessURLNIEM   return to top
Description

The landing page (URL) that provides you access to the NIEM interface.

Must be a URL.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:anyURI

Vendor

Term Name: dhs:vendor   return to top
Description

The vendor, supplier, or set of suppliers who supplied the information contained in the resource.

Example:

  • ECS
  • ECS, GeoMgmt Inc.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Functional Data Domain

Term Name: dhs:functionalDataDomain   return to top
Description

The functional subject area of the resource.

List:

  • Biometrics
  • CBRN
  • Cybersecurity
  • Emergency Mgmt.
  • Geospatial
  • Immigration
  • Infra. Protection
  • Intelligence
  • International Trade
  • Law Enforcement
  • Maritime
  • Mission Support & Mgmt.
  • Screening

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Is Stream

Term Name: dhs:isStream   return to top
Description

True or False

Data Set or Data Source is streaming. The data is constantly updated.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Access Instructions

Term Name: dhs:accessInstructions   return to top
Description

The steps or actions an individual must take to gain access to the dataset. Can work in conjunction with accessRights (max length 5,000 characters).

Example: "Visit the (accessURL) link, click on 'Your ArcGIS organization's URL', enter the URL, and click Continue."

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Table Count

Term Name: dhs:tableCount   return to top
Description

The number of 2-dimensional tables in the data.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:integer

Record Count

Term Name: dhs:recordCount   return to top
Description

The total number of records in the data.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:integer

Dataset Classification

Term Name: dhs:datasetClassification   return to top
Description

The security classification of the data.

Example:

  • CUI
  • Confidential
  • Secret
  • Top Secret

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Characteristics - Person Level

Term Name: dhs:ch-person-level   return to top
Description

True or False

The information in the resource contains enough personal records or personal information to identify an individual.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Financial

Term Name: dhs:ch-financial   return to top
Description

True or False

The information in the resource contains financial information.

Example:

  • Bank Statements
  • Credit Reports
  • W2
  • Net Worth
  • Bank Secrecy

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Event records

Term Name: dhs:ch-event-records   return to top
Description

True or False

The information in the resource contains information about events which are tagged with a specific place and time.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Faces

Term Name: dhs:ch-faces   return to top
Description

True or False

The information in the resource contains facial recognition data or images of human faces.

Example:

  • https://catalog.data.gov/dataset/global-entry-master-dataset

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Fingerprints

Term Name: dhs:ch-fingerprints   return to top
Description

True or False

The information in the resource contains fingerprint data or images of fingerprints.

Example:

  • https://catalog.data.gov/dataset/global-entry-master-dataset

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - CUI

Term Name: dhs:ch-cui   return to top
Description

True or False

The information in the resource contains Controlled Unclassified Information (CUI).

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - PHI

Term Name: dhs:ch-phi   return to top
Description

True or False

The information in the resource contains Protected Health Information (PHI).

Examples:

  • Medical Records
  • Diagnoses
  • Medical Results

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - PII

Term Name: dhs:ch-pii   return to top
Description

True or False

The information in the resource contains Personally Identifiable Information (PII).

Reference: https://www.dhs.gov/sites/default/files/publications/dhs%20policy%20directive%20047-01-007%20handbook%20for%20safeguarding%20sensitive%20PII%2012-4-2017.pdf

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Geospatial

Term Name: dhs:ch-geospatial   return to top
Description

True or False

The information in the resource contains geospatial information.

Example:

  • https://catalog.data.gov/dataset/fema-historical-disaster-declarations-shp
  • https://catalog.data.gov/dataset/formerly-used-defense-sites-fuds-public-properties

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Environmental

Term Name: dhs:ch-environmental   return to top
Description

True or False

The information in the resource contains environmental information.

Example:

  • https://catalog.data.gov/dataset/environmental-planning-historic-preservation-2dedc

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - FISA

Term Name: dhs:ch-fisa   return to top
Description

True or False

The information in the resource contains information pertaining to the Foreign Intelligence Surveillance Act (FISA).

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - 8usc1367

Term Name: dhs:ch-8usc1367   return to top
Description

True or False

The information in the resource contains information covered by 8 U.S.C. 1367.

Reference: https://www.dhs.gov/sites/default/files/publications/dhs_foia_instruction_section_1367_information.pdf

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - proprietary info

Term Name: dhs:ch-propin   return to top
Description

True or False

The information in the resource contains proprietary commercial information.

Example:

  • https://catalog.data.gov/dataset/automated-commercial-environment-international-trade-data-system-master-dataset

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Immigration

Term Name: dhs:ch-immigration   return to top
Description

True or False

The information in the resource contains information pertaining to immigration.

Example:

  • Asylee
  • Resident Status
  • Visas
  • Victims of Human Trafficking
  • https://catalog.data.gov/dataset/immigration-statistics-ef958

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Critical Infrastructure

Term Name: dhs:ch-criticalInfrastructure   return to top
Description

True or False

The information in the resource contains critical infrastructure information.

Example:

  • https://catalog.data.gov/dataset/emergency-medical-service-ems-stations
  • https://catalog.data.gov/dataset/ilss-data

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - PCII - Protected Critical Infrastructure Information

Term Name: dhs:ch-pcii   return to top
Description

True or False

The information in the resource contains protected critical infrastructure information.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - biometrics

Term Name: dhs:ch-biometrics   return to top
Description

True or False

The information in the resource contains biometric information.

Example:

  • https://catalog.data.gov/dataset/global-entry-master-dataset
  • https://catalog.data.gov/dataset/arrival-departure-information-system-b3b75

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Dissemination Restrictions

Term Name: dhs:ch-disseminationRestrictions   return to top
Description

True or False

The information has dissemination restrictions.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - LES - Law Enforcement Sensitive

Term Name: dhs:ch-les   return to top
Description

True or False

The information in the resource contains information that is deemed Law Enforcement Sensitive.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Synthetic

Term Name: dhs:ch-synthetic   return to top
Description

True or False

The data is created manually or artificially, apart from the data generated by real-world events. This may include data that is generated for the purposes of modeling and may be generated by a computer simulation. The data approximates real data, but does not necessarily reflect the real world. This includes synthetically derived data (e.g., data that is created programmatically from a set of source data, typically with some algorithm applied).

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean

Characteristics - Anonymized

Term Name: dhs:ch-anonymized   return to top
Description

True or False

The identifying information in the resource has been removed or changed so that the individual cannot be identified.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype xsd:boolean
Shacl datatype xsd:boolean
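
The Characteristics terms above are all booleans that share the dhs:ch- prefix, so a record's sensitive characteristics can be summarized by collecting the ones set to true. The Python sketch below assumes a record is available as a flat dictionary keyed by term name; that layout and the function name are illustrative assumptions, not the tool's actual serialization.

```python
def flagged_characteristics(record: dict) -> list[str]:
    """Return, sorted, the dhs:ch-* term names whose value is exactly True."""
    return sorted(
        key
        for key, value in record.items()
        if key.startswith("dhs:ch-") and value is True
    )


# Hypothetical record; the values are illustrative only.
example_record = {
    "dhs:ch-pii": True,
    "dhs:ch-faces": True,
    "dhs:ch-synthetic": False,
    "dhs:recordCount": 1200,
}
print(flagged_characteristics(example_record))
```

Terms that are absent or false are simply omitted, which keeps the summary short for records with few flagged characteristics.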

Hosting Location

Term Name: dhs:hostingLocation   return to top
Description

The location where the data is hosted.

Example:

  • CIRRUS
  • AWS GovCloud

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:string

Hosted in Cloud

Term Name: dhs:hostedInCloud   return to top
Description

True or False

Is the dataset stored in the cloud?

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Easily Accessible By Creating Component

Term Name: dhs:easilyAccessibleByCreatingComponent   return to top
Description

True or False

Can the information be easily accessed by individuals within the DHS component that created it?

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Easily Accessible By All Components

Term Name: dhs:easilyAccessibleByAllComponents   return to top
Description

True or False

Can the information be easily accessed by individuals within all of DHS?

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Easily Accessible By General Public

Term Name: dhs:easilyAccessibleByGeneralPublic   return to top
Description

True or False

Can the information be easily accessed by the general public?

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:boolean

Record Transmission

Term Name: dhs:recordTransmission   return to top
Description

The date on which the information was submitted into the system.

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:date

Validity Time

Term Name: dhs:validityTime   return to top
Description

The period of time for which the metadata record is valid.

Must be in xsd:duration form.

Example:

  • P1Y=Period 1 Year
  • P2M=Period 2 Months
  • P3W=Period 3 Weeks
  • P4D=Period 4 Days
  • P1Y2M=Period 1 Year & 2 Months

Namespace http://github.com/usdhs/dcat-tool/
Required No
Rdfs datatype rdfs:Literal
Shacl datatype xsd:duration
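
The duration strings in the examples above can be broken into their components with a small parser. The Python sketch below covers only the date-based forms shown (years, months, weeks, days); time components are out of scope, and the function name is hypothetical. Note that the W3C lexical form for xsd:duration (PnYnMnDTnHnMnS) does not define a week designator, so a strict validator may reject P3W; the sketch accepts it only because it appears in the examples.

```python
import re

# Date-only duration components, in the order they must appear.
_DURATION_SHAPE = re.compile(
    r"P(?:(?P<years>[0-9]+)Y)?"
    r"(?:(?P<months>[0-9]+)M)?"
    r"(?:(?P<weeks>[0-9]+)W)?"   # accepted here; not part of strict xsd:duration
    r"(?:(?P<days>[0-9]+)D)?"
)


def parse_validity_time(value: str) -> dict[str, int]:
    """Split a duration such as 'P1Y2M' into its named integer components."""
    match = _DURATION_SHAPE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a recognized duration: {value!r}")
    parts = {
        name: int(text)
        for name, text in match.groupdict().items()
        if text is not None
    }
    if not parts:
        # A bare "P" matches the pattern but carries no duration.
        raise ValueError("duration must contain at least one component")
    return parts
```

With this sketch, parse_validity_time("P1Y2M") yields a dictionary with years set to 1 and months set to 2.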