Fetching Datasets into Excel Using VBA
Objective
The goal of this macro is to retrieve datasets from the Dateno API and load them into a Microsoft Excel workbook. Each dataset is stored in a separate sheet and made available for review or further processing.
Plan
The macro operates on the initial state where the Datasets worksheet contains a list of datasets to be loaded. This list must include two columns:
- Dataset name—specifies the name of the worksheet where the dataset will be loaded
- Dataset ID—the unique identifier of the dataset in the Dateno registry
Once the list is prepared, the user clicks the Load datasets from Dateno button to begin the process. For each dataset in the list, the macro performs the following steps:
- Fetching the dataset card from the Dateno API using the Dataset ID.
- Extracting the URL of the associated CSV file from the dataset card.
- Downloading and parses the CSV file into a two-dimensional array.
- Creating a new worksheet (if it does not exist) using the Dataset name and copies the parsed data into it.
Source
The functionality is implemented in a VBA sample. To use it, you must have Microsoft Excel with macros enabled.
WARNING
Before enabling macros, make sure that antivirus software is installed and properly configured on your computer.
Comments
getDatenoSearchApiUrl()
Purpose
Returns the base URL of the Dateno search API.
Arguments
None.
Returns
A string containing the base API URL.
getMyDatenoApiKey()
Purpose
Returns the user's personal Dateno API key.
Arguments
None.
Returns
A string containing the API key.
assembleDatasetCardUrl(datasetId As String)
Purpose
Constructs the complete API URL for fetching a dataset card.
Arguments
datasetId
: Unique identifier of the dataset.
Returns
A string containing the full API request URL.
fetchDatasetCard(datasetId As String)
Purpose
Fetches a dataset card from the Dateno API and returns it as a raw JSON string.
Arguments
datasetId
: Unique identifier of the dataset.
Returns
A raw JSON string representing the dataset card.
getDatasetCsvTableUrl(datasetCard As Object)
Purpose
Retrieves the URL of a CSV resource from a parsed dataset card.
Arguments
datasetCard
: String returned byfetchDatasetCard
.
Returns
A string containing the CSV file URL, or an empty string if not found.
Operational Principle
Sad but true: VBA does not offer a native way to parse JSON. While third-party JSON parsers for VBA exist, using them here would be overkill. Instead, we treat the dataset card as a plain text string and extract the descriptor of the target data file using regular expressions.
The structure of the descriptor we're looking for is consistent and recognizable. It appears as a JSON block containing both a "format": "CSV"
entry and a "url"
pointing to the downloadable file. For example:
---> {
"id": null,
"name": "CSV",
"datasize": null,
------> "format": "CSV",
"mimetype": null,
------> "url": "https://..."
---> }
The function scans all bracketed {...}
blocks in the dataset card. It selects the one that contains both "format": "CSV"
and "url"
. Once found, it extracts the url
field and returns it. This allows us to reliably retrieve the correct CSV resource without relying on a full JSON parser.
fetchDatasetCsvTable(datasetCsvTableUrl As String)
Purpose
Downloads the CSV content from the provided URL.
Arguments
datasetCsvTableUrl
: Direct download link to the CSV file.
Returns
CSV data as a plain string, or an error message if the request fails.
detectLineSeparator(csvTable As String)
Purpose
Determines the line separator used in the CSV string.
Arguments
csvTable
: CSV content as a string.
Returns
A string representing the line separator (CRLF, CR, or LF).
detectCsvSeparator(csvTable As String)
Purpose
Determines the column separator used in the CSV content.
Arguments
csvTable
: CSV content as a string.
Returns
A string containing the separator character (comma or semicolon).
parseCsvTable(csvTable As String)
Purpose
Parses the CSV string into a 2D array of values.
Arguments
csvTable
: CSV content as a string.
Returns
A two-dimensional array representing rows and columns of the CSV table.
sheetExists(sheetName As String)
Purpose
Checks if a worksheet with the specified name exists in the workbook.
Arguments
sheetName
: Name of the sheet to check.
Returns
True if the sheet exists, otherwise False.
guaranteeSheet(datasetName As String)
Purpose
Ensures a sheet with the specified name exists in the workbook. Creates it if missing.
Arguments
datasetName
: Name of the dataset and the desired sheet.
Returns
None.
getActiveSheetName()
Purpose
Returns the name of the currently active worksheet.
Arguments
None.
Returns
A string containing the sheet name.
copyTableToSheet(table As Variant, sheetName As String)
Purpose
Copies a 2D array of data to a worksheet.
Arguments
table
: 2D array of parsed CSV data.sheetName
: Name of the worksheet to paste into.
Returns
None.
loadDatasetCsvTable(datasetId As String, datasetName As String)
Purpose
Performs the full workflow for one dataset: fetches metadata, downloads CSV, parses it, and writes it to a worksheet.
Arguments
datasetId
: Unique identifier of the dataset.datasetName
: Name of the worksheet to use.
Returns
None.
loadDatasetCsvTablesToWorkbook(tocSheetName As String)
Purpose
Iterates over the dataset list on the specified sheet and loads each dataset using loadDatasetCsvTable
.
Arguments
tocSheetName
: Name of the sheet containing the dataset list.
Returns
None.
cmdLoadDatasets()
Purpose
Entry point macro. Loads datasets listed on the active sheet.
Arguments
None.
Returns
None.