Dataset Descriptor

Purpose

A dataset descriptor is a compact metadata structure returned in the dataset search results from the Dateno API. It provides summary information about a dataset but omits the full content available in a dataset card.

Descriptors are primarily used in search responses to support filtering, display dataset titles, and provide basic data attributes. Unlike dataset cards, descriptors do not include resource descriptors or full-length descriptions.

Relation to Dataset Card

A dataset descriptor and a dataset card refer to the same dataset but serve different purposes and have different structures.

| Aspect | Dataset Descriptor | Dataset Card |
| --- | --- | --- |
| Usage | Included in search results | Retrieved via direct dataset request |
| Content | Summary metadata only | Full metadata including resource descriptors |
| Resources | Not included | Listed in the resources property |
| Long Description | Not included | Available in the description field |
| Use in Filtering | Yes, via structured fields | Not used for filtering |

Structure

Each dataset descriptor is a JSON object located in the _source field of a hits.hits array element. It contains the following top-level parts:

| Property | Description |
| --- | --- |
| id | Unique identifier of the dataset in the Dateno registry |
| source | Catalog metadata |
| dataset | Dataset metadata assigned in the registry |
| scores | Numeric scores for dataset ranking in search results |
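
As a minimal sketch of the layout described above, the following Python pulls the descriptors out of a search response that has already been parsed into a dictionary. The `response` value is a hand-made stand-in trimmed from the example on this page, and `extract_descriptors` is a hypothetical helper name, not part of the Dateno API.

```python
from typing import Any


def extract_descriptors(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the _source object of every element in hits.hits."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_source"] for hit in hits if "_source" in hit]


# Hand-made stand-in for a parsed search response (not real API output).
response = {
    "hits": {
        "hits": [
            {
                "_source": {
                    "id": "8dbfbe735938a118b2a69a6fc8e21c4839561007687ea350c4947ea6f53dbbc5",
                    "source": {"uid": "cdi00000310", "name": "Government of Alberta open datasets"},
                    "dataset": {"title": "Rabbit and rodent management in Alberta"},
                    "scores": {"feature_score": 95},
                }
            }
        ]
    }
}

descriptors = extract_descriptors(response)
for descriptor in descriptors:
    print(descriptor["id"], descriptor["dataset"]["title"])
```

Using `.get` with empty defaults means a response without any hits yields an empty list instead of raising an error.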

source

Describes the catalog and organization that published the dataset.

| Property | Description |
| --- | --- |
| schema | Plain string indicating the source schema (e.g., ckan, arcgishub) |
| uid | Unique identifier of the catalog in Dateno |
| owner_name | Name of the publisher |
| software | Directory entry describing the maintenance software |
| catalog_type | Directory entry identifying the catalog type |
| name | Name of the catalog |
| macroregions | Array of directory entries for macroregions (see dataset attributes) |
| countries | Array of directory entries for countries |
| subregions | Array of directory entries for subregions |
| langs | Array of directory entries for languages |
| owner_type | Plain string describing the type of organization (e.g., "Central government", "Academy") |
| url | URL of the original catalog |

TIP
Dataset descriptors and dataset cards include only basic metadata about the associated catalog. To retrieve full catalog metadata, pass the uid value to the Fetch Single Catalog request in the Dateno API.
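
When several descriptors come from the same catalog, it may help to collect the distinct uid values first so that each catalog is requested only once. The sketch below assumes the descriptors are already available as Python dictionaries; `unique_catalog_uids` is a hypothetical helper, the second uid is made up for illustration, and the Fetch Single Catalog call itself is not shown.

```python
def unique_catalog_uids(descriptors: list[dict]) -> list[str]:
    """Return distinct source.uid values in first-seen order."""
    seen: dict[str, None] = {}
    for descriptor in descriptors:
        uid = descriptor.get("source", {}).get("uid")
        if uid is not None:
            seen.setdefault(uid, None)  # dicts preserve insertion order
    return list(seen)


# Illustrative descriptors; "cdi-hypothetical" is not a real catalog uid.
sample_descriptors = [
    {"source": {"uid": "cdi00000310"}},
    {"source": {"uid": "cdi-hypothetical"}},
    {"source": {"uid": "cdi00000310"}},
]

catalog_uids = unique_catalog_uids(sample_descriptors)
print(catalog_uids)
```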

dataset

Describes dataset-level metadata assigned during catalog indexing.

| Property | Description |
| --- | --- |
| title | Title of the dataset |
| short_text | Brief description for previews |
| formats | List of data formats (e.g., .csv, .json) |
| tags | Array of user-supplied tags |
| topics | Normalized topics assigned by indexer |
| topics_original | Original topics from the source catalog |
| geotopics | Geospatial topics if assigned |
| url | Link to the original dataset page |
| num_resources | Number of resources in the dataset |
| date_created | Dataset creation date |
| date_changed | Dataset last updated date |
| datatypes | Data types describing the nature of data |
| responsible | Array of responsible parties (e.g., publishers or creators) |
| id | Internal identifier of the dataset |
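
The following sketch shows one way to turn the dataset object into a one-line summary for display. It assumes that date_created and date_changed are ISO 8601 strings without a time zone, as in the example on this page; other catalogs may use different formats. `summarize_dataset` is a hypothetical helper name.

```python
from datetime import datetime


def summarize_dataset(dataset: dict) -> str:
    """Build a short display line from a descriptor's dataset object."""
    # Assumption: dates look like "2018-07-24T19:01:14.899150".
    created = datetime.fromisoformat(dataset["date_created"])
    changed = datetime.fromisoformat(dataset["date_changed"])
    formats = ", ".join(dataset.get("formats", [])) or "unknown"
    return (
        f'{dataset["title"]} [{formats}] '
        f"created {created:%Y-%m-%d}, changed {changed:%Y-%m-%d}"
    )


# Values copied from the example descriptor on this page.
sample_dataset = {
    "title": "Rabbit and rodent management in Alberta",
    "formats": [".pdf"],
    "date_created": "2018-07-24T19:01:14.899150",
    "date_changed": "2023-08-31T17:33:25.714755",
}

summary = summarize_dataset(sample_dataset)
print(summary)
```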

scores

Contains scores used by the search engine to rank search results.

| Property | Description |
| --- | --- |
| feature_score | Score used to order and prioritize results |
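
If descriptors need to be re-ordered on the client, a sort on feature_score is one option. This sketch assumes that a higher score means a higher rank, which this page does not state explicitly, and treats a missing score as 0. The descriptor ids and `rank_by_feature_score` are illustrative names only.

```python
def rank_by_feature_score(descriptors: list[dict]) -> list[dict]:
    """Sort descriptors by scores.feature_score, highest first."""
    return sorted(
        descriptors,
        key=lambda d: d.get("scores", {}).get("feature_score", 0),
        reverse=True,  # assumption: higher score ranks first
    )


# Illustrative descriptors with made-up ids.
unranked = [
    {"id": "a", "scores": {"feature_score": 40}},
    {"id": "b", "scores": {"feature_score": 95}},
    {"id": "c"},  # no scores object; treated as 0
]

ranked_ids = [d["id"] for d in rank_by_feature_score(unranked)]
print(ranked_ids)
```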

Filtering

Most fields in the source and dataset objects are used for filtering search results. These include:

  • source.countries.name
  • source.langs.id
  • dataset.formats
  • dataset.topics
  • source.catalog_type

For detailed filter syntax, see Using Filters in Requests.
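
The dotted names above can be read as paths into the descriptor, where array-valued properties such as countries contribute every element. The sketch below resolves such a path against a descriptor on the client side. It only illustrates how the paths map onto the JSON structure and is not the API's filter syntax; `resolve_path` and `matches` are hypothetical helpers.

```python
from typing import Any


def resolve_path(obj: Any, path: str) -> list[Any]:
    """Collect every value reachable at a dotted path, descending into lists."""
    current = [obj]
    for key in path.split("."):
        next_level = []
        for item in current:
            candidates = item if isinstance(item, list) else [item]
            for candidate in candidates:
                if isinstance(candidate, dict) and key in candidate:
                    next_level.append(candidate[key])
        current = next_level
    # Flatten list-valued leaves (e.g. dataset.formats) into plain values.
    flat = []
    for value in current:
        if isinstance(value, list):
            flat.extend(value)
        else:
            flat.append(value)
    return flat


def matches(descriptor: dict, path: str, value: Any) -> bool:
    """True if the value occurs anywhere at the given path."""
    return value in resolve_path(descriptor, path)


# Trimmed copy of the example descriptor on this page.
sample_descriptor = {
    "source": {
        "catalog_type": "Open data portal",
        "countries": [{"name": "Canada", "id": "CA"}],
        "langs": [{"name": "English", "id": "EN"}],
    },
    "dataset": {"formats": [".pdf"], "topics": []},
}

print(resolve_path(sample_descriptor, "source.countries.name"))
print(matches(sample_descriptor, "dataset.formats", ".pdf"))
```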

Example

{
  "int_id": "4dc106c6-0027-478e-af07-1c67226a90b0",
  "source": {
    "schema": "ckan",
    "uid": "cdi00000310",
    "owner_name": "Government of Alberta",
    "software": {
      "name": "CKAN",
      "id": "ckan"
    },
    "catalog_type": "Open data portal",
    "name": "Government of Alberta open datasets",
    "macroregions": [
      {
        "name": "Northern America",
        "id": "021"
      }
    ],
    "langs": [
      {
        "name": "English",
        "id": "EN"
      }
    ],
    "countries": [
      {
        "name": "Canada",
        "id": "CA"
      }
    ],
    "subregions": [
      {
        "name": "Alberta",
        "id": "CA-AB"
      }
    ],
    "owner_type": "Regional government",
    "url": "https://open.alberta.ca"
  },
  "id": "8dbfbe735938a118b2a69a6fc8e21c4839561007687ea350c4947ea6f53dbbc5",
  "scores": {
    "feature_score": 95
  },
  "dataset": {
    "topics_original": [],
    "geotopics": [],
    "formats": [
      ".pdf"
    ],
    "topics": [],
    "date_created": "2018-07-24T19:01:14.899150",
    "short_text": "Summarizes information about rabbit and rodent management in Alberta,",
    "num_resources": 1,
    "description": "Summarizes information about rabbit and rodent management in Alberta, and the role of Alberta's Wildlife Act in regulating how they can be harvested or controlled in the province.",
    "title": "Rabbit and rodent management in Alberta",
    "date_changed": "2023-08-31T17:33:25.714755",
    "url": "https://open.alberta.ca/dataset/rabbit-and-rodent-management-in-alberta",
    "datatypes": [
      "documents"
    ],
    "has_archive": false,
    "tags": [
      "rabbits",
      "rodents",
      "wild species"
    ],
    "responsible": [
      {
        "role": "Publisher",
        "id": "environment1971-1992--1999-2011",
        "title": "Environment (1971-1992, 1999-2011)"
      }
    ],
    "id": "4dc106c6-0027-478e-af07-1c67226a90b0"
  }
}