Tech Labs | 25 Mar 2021

API guide for the Global Data Workstation

Technology Team

Oxford Economics

Our users can now obtain API access to the Global Data Workstation. With the API, you can create custom saved selections programmatically, run them against any of your subscribed databanks, or integrate Oxford Economics’ data into applications like Tableau and Microsoft Power BI.

Getting an API key

To request an API key, please contact us using the Helpdesk. Eligibility for API provisioning depends on your organisation’s subscription. The key provided will be tied to a new, separate technical user account, which will maintain the same subscription profile as your organisation’s master account.

Alternatively, if you already have an account with access, you can use the login functionality of the /users endpoint to request your key directly. Read below more for information on endpoints.

Endpoints

Each of the endpoints below returns a JSON object containing the corresponding requested data. Most require a valid API token and each can take optional GET and POST data to specify existing selections or databanks.

/Databank

Fetch information about the available Global Workstation databanks. This includes things like the start and end year of availability, the relevant corresponding indicator and location trees, and whether your user account has access.

1. GET /api/databank returns a JSON list of databanks with the details listed above.

curl --request GET --url \
   https://services.oxfordeconomics.com/api/databank \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

2. GET /api/databank/{selection_id} returns a JSON object with databank information.

curl --request GET \
   --url https://services.oxfordeconomics.com/api/databank/{{databank_id}} \<br>
   --header 'api-key: {{api_key}}' \<br>
   --header 'content-type: application/json'

/Download

Request a download based on an existing selection Id or a well-formed selection provided in the POST body.

1. GET /api/download/{id} returns a JSON object based on the saved selection ID provided.

curl --request GET \
   --url 'https://services.oxfordeconomics.com/api/download/{{selection_id}} \<br>
 	?includemetadata=true&page=0&pagesize=5' \<br>
   --header 'api-key: {{api_key}}' \<br>
   --header 'content-type: application/json'

2. POST /api/download?includemetadata={true/false} returns a JSON object with the data described by the selection provided in the request body. Click here for a breakdown of the selection request dictionary below.

Note: you can opt to append paging options to either of the above methods via &page={page number}&pagesize={size} where pagesize dictates the number of elements on each page.

curl --request POST \
   --url 'https://services.oxfordeconomics.com/api/download?includemetadata=true&page=0&pagesize=5' \<br>
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json' \
   --data '{
   "MeasureCode": "L",
   "IsTemporarySelection": true,
   "DatabankCode": "WDMacro",
   "Sequence": "EarliestToLatest",
   "GroupingMode": false,
   "TransposeColumns": null,
   "Order": "IndicatorLocation",
   "IndicatorSortOrder": "AlphabeticalOrder",
   "LocationSortOrder": "AlphabeticalOrder",
   "SortedColumnName": null,
   "SortedColumnOrder": null,
   "Format": 0,
   "LegacyDatafeedFileStructure": false,
   "Variables": [
        {
              "VariableCode": "CPI",
              "ProductTypeCode": "WMC",
              "MeasureCodes": [
                    "L"
              ]
        },
        {
              "VariableCode": "GDP$",
              "ProductTypeCode": "WMC",
              "MeasureCodes": [
                    "L",
                    "PY",
                    "DY"
              ]
        }
   ],
   "Regions": [
        {
              "DatabankCode": "WDMacro",
              "RegionCode": "GBR"
        },
        {
              "DatabankCode": "WDMacro",
              "RegionCode": "USA"
        }
   ],
      "ShareCodeId": null,
      "ListingType": "Private",
      "IsDatafeed": false,
      "StartYear": 2015,
      "EndYear": 2021,
      "Frequency": "Annual",
      "StackedQuarters": false
  }'

/FileDownload

Run and download a saved selection as either an Excel or CSV file. Depending on the expected size of the requested dataset and demand on our servers, /FileDownload might not return immediately. For this reason, we have included the /QueueDownload functionality, which allows the user to submit a request and then later poll for readiness (see below).

1. GET /api/filedownload/{id} returns a JSON object based on the saved selections provided in the controller dictionary in the request body.

curl --request GET \
   --url https://services.oxfordeconomics.com/api/filedownload/{{selection_id}} \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

2. POST /api/filedownload returns a JSON object based on the saved selection ID provided. Click here to see the controller configuration dictionary below.

curl –request POST \
   --url https://services.oxfordeconomics.com/api/filedownload \
   --header 'api-key: {{api_key}}' \
  --header 'content-type: application/json'
--data '{
   "format": “csv”,
   "name": "download name",
   "selections": [
        {
              {{selection id}}
        },
 }'

/QueueDownload

Run and download a saved selection as either an Excel or CSV file. The API returns two URLs: ReadyUrl, which you can poll to see whether the download is ready; and URL, which you can download the data from once ready.

1. GET /api/queuedownload/{id} returns a JSON object containing URL for download and ReadyUrl for polling download readiness..

curl --request GET \
   --url https://services.oxfordeconomics.com/api/queuedownload/{{selection_id}} \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

2. POST /api/queuedownload returns a JSON object containing URL for download and ReadyUrl for polling download readiness.

<code> curl --request GET \<br>
   --url https://services.oxfordeconomics.com/api/queuedownload \<br>
   --header 'api-key: {{api_key}}' \<br>
   --header 'content-type: application/json'
--data '{
   "format": “csv”,
   "name": "download name",
   "selections": [
        {
              {{selection id}}
        },
 }'

/Region

Shows which regions are available for a given databank. This includes the RegionCode value, which you can use to build custom selections.

1. GET /api/region/{id} returns a JSON list object with the regions available for a given databank.

curl --request GET \
   --url https://services.oxfordeconomics.com/api/region/{{databank_id}} \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

/SavedSelections

Create saved selections, or access and update existing ones.

1. GET /api/savedselections/{id} returns a JSON object describing the selection at the ID provided, including things like start and end date and variables requested.

curl --request GET \
--url https://services.oxfordeconomics.com/api/savedselections/{{selection_id}} \
--header 'api-key: {{api_key}}' \
--header 'content-type: application/json'

2. PUT /api/savedselections/{id} takes a selection dictionary in the post body and updates the selection with the provided ID.

Note: this does not return the updated selection object.

curl --request PUT \
  --url https://services.oxfordeconomics.com/api/savedselections/{{selection_id}} \
  --header 'api-key: {{api_key}}' \
  --header 'content-type: application/json' \
  --data '{
  "Id": "{{selection_id}}",
  "SelectionType": "QuerySelection",
  "MeasureCode": "L",
  "IsTemporarySelection": false,
  "DatabankCode": "WDMacro",
  "Sequence": "EarliestToLatest",
  "GroupingMode": false,
  "TransposeColumns": null,
  "Order": "IndicatorLocation",
  "IndicatorSortOrder": "AlphabeticalOrder",
  "LocationSortOrder": "AlphabeticalOrder",
  "SortedColumnName": null,
  "SortedColumnOrder": null,
  "Format": 0,
  "LegacyDatafeedFileStructure": false,
  "Variables": [
    {
      "VariableCode": "CPI",
      "ProductTypeCode": "WMC",
      "MeasureCodes": [
        "L"
      ]
    },
    {
      "VariableCode": "GDP$",
      "ProductTypeCode": "WMC",
      "MeasureCodes": [
        "L",
        "PY",
        "DY"
      ]
    }
  ],
  "Regions": [
    {
      "DatabankCode": "WDMacro",
      "RegionCode": "GBR"
    },
    {
      "DatabankCode": "WDMacro",
      "RegionCode": "USA"
    }
  ],
  "ShareCodeId": null,
  "ListingType": "Private",
  "IsDatafeed": false,
  "Name": "Selection - (Updated yyyy/mm/dd)",
  "StartYear": 2015,
  "EndYear": 2021,
  "Frequency": "Annual",
  "StackedQuarters": false
}'

3. DELETE /api/savedselections/{id} deletes a user’s selection with the provided ID.

curl --request DELETE \
   --url https://services.oxfordeconomics.com/api/savedselections/{{selection_id}} \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

4. POST /api/savedselections takes a selection dictionary in the post body and creates and returns a new selection object.

Note: Click here for a breakdown of the selection request dictionary below.

curl --request POST \
  --url https://services.oxfordeconomics.com/api/savedselections \
  --header 'api-key: {{api_key}}' \
  --header 'content-type: application/json' \
  --data '{
  "SelectionType": "QuerySelection",
  "MeasureCode": "L",
  "IsTemporarySelection": false,
  "DatabankCode": "WDMacro",
  "Sequence": "EarliestToLatest",
  "GroupingMode": false,
  "TransposeColumns": null,
  "Order": "IndicatorLocation",
  "IndicatorSortOrder": "AlphabeticalOrder",
  "LocationSortOrder": "AlphabeticalOrder",
  "SortedColumnName": null,
  "SortedColumnOrder": null,
  "Format": 0,
  "LegacyDatafeedFileStructure": false,
  "Variables": [
    {
      "VariableCode": "CPI",
      "ProductTypeCode": "WMC",
      "MeasureCodes": [
        "L"
      ]
    },
    {
      "VariableCode": "GDP$",
      "ProductTypeCode": "WMC",
      "MeasureCodes": [
        "L",
        "PY",
        "DY"
      ]
    }
  ],
  "Regions": [
    {
      "DatabankCode": "WDMacro",
      "RegionCode": "GBR"
    },
    {
      "DatabankCode": "WDMacro",
      "RegionCode": "USA"
    }
  ],
  "ShareCodeId": null,
  "ListingType": "Private",
  "IsDatafeed": false,
  "Name": "New selection",
  "StartYear": 2015,
  "EndYear": 2021,
  "Frequency": "Annual",
  "StackedQuarters": false
}'

/ShapedDownload

Download saved selections configured into shaped tables.

1. POST /api/shapeddownload/{id} returns a JSON dictionary of download data based on the selection ID provided and the shape configuration sent in the request body.

curl --request POST \
  --url https://services.oxfordeconomics.com/api/shapeddownload/{{selection_id}} \
  --header 'api-key: {{api_key}}' \
  --header 'content-type: application/json' \
  --data '{
	"Pivot": "false",
	"StackedQuarters": "false",
	"Frequency": "Both"
 }'

/Tree

Provides useful information on a databank’s related trees.

1. GET /api/tree/Locations_{id} returns a JSON list of a databank’s related locations.

 curl --request GET \
   --url https://services.oxfordeconomics.com/api/tree/Locations_{{databank_id}} \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

2. GET /api/tree/Indicators_{id} returns a JSON list of a databank’s related indicators.

 curl --request GET \
   --url https://services.oxfordeconomics.com/api/tree/Indicators_{{databank_id}} \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

/Users

Gets your user data or create a new session by logging in.

1. GET /api/users/{id} returns a JSON object of user information with the ContactId provided, including a list of available saved selections. If id is set to “me”, this endpoint will return information for the user with the API key in the request header.

 curl --request GET \
   --url https://services.oxfordeconomics.com/api/users/{{contact_id}} \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

2. POST /api/users returns a JSON object with user information, including a list of available saved selections, as well as the user’s API key. See the request dictionary entry for user credentials below.

 curl --request POST \
   --url https://services.oxfordeconomics.com/api/users \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json' \
   --data '{
        "Username": {{user_name}},
        "Password": {{password}}
 }'

/Variable

Shows which variables are available for a given databank. This includes the VariableCode value, which you will use in building custom selections.

1. Get /api/variable/{id} returns a JSON list object with the indicators available for a given databank.

 curl --request GET \
   --url https://services.oxfordeconomics.com/api/variable/{{databank_id}} \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

Request header

HTTP requests to the API require two headers be set: api-key, which should be set to the user’s API key, and content-type, which is typically “application/json”.

Example request, indicating proper HTTP headers:

curl --request GET --url \
   https://services.oxfordeconomics.com/api/databank \
   --header 'api-key: {{api_key}}' \
   --header 'content-type: application/json'

Key request dictionaries

Note that for the purpose of describing these dictionaries, the ( and ) characters indicate a choice and should be omitted in practice. For example, format: (“csv”, “excel”) indicates a choice between the strings csv and excel, so in practice this might look like format: “csv” or format: “excel”.

In contrast, the [ and ] characters indicate an array and must not be omitted.

Controller

Controller layout:

 {
   format: ("csv", "excel"),
   name: string,
   selections: [selection id] 
 }

Example controller:

 {
   format: "csv",
   name: "GDP$ USA, GBR 2016-21",
   selections: ["cc0547f1-4600-4ace-88a0-d86876508a15"] 
 }

Selection

Selection layout:

{ 
   Id: string,
   Name: string,
   DatabankCode: string,
  MeasureCode: see measure codes,
  StartYear: int,
  EndYear: int,
  StackedQuarters: ("true", "false"),
  Frequency: ("Annual", "Quarterly", "Both"),
  GroupingMode: ("true", "false"),
  IndicatorSortOrder: ("AlphabeticalOrder", "TreeOrder"),
  IsDatafeed: ("true", "false"),
  IsTemporarySelection: ("true", "false"),
  LegacyDatafeedFileStructure: ("true", "false"), 
  ListingType: ("Hidden", "Private", "Company", "Public", "Shared"),
  LocationSortOrder: ("AlphabeticalOrder", "TreeOrder"),
  Order: ("IndicatorLocation", "LocationIndicator"),
  Regions:
  [ {
   DatabankCode: string,
   RegionCode: string
  } ]
  Variables:
  [ {
   ProductTypeCode: string,
   VariableCode: string,
   MeasureCodes: [see measure codes]
  } ]
 }

Example selection:

{
     DownloadUrl: "/api/download/46bbd06a-5bf2-4278-b96f-037f15e2f256",
     Id: "46bbd06a-5bf2-4278-b96f-037f15e2f256",
     SelectionType: "QuerySelection",
     MeasureCode: "L",
     IsTemporarySelection: false,
     LastUpdate: "2017-05-08T18:51:54.363",
     DatabankCode: "WDMacro",
     Sequence: "EarliestToLatest",
     GroupingMode: false,
     TransposeColumns: null,
     Order: "IndicatorLocation",
     IndicatorSortOrder: "AlphabeticalOrder",
     LocationSortOrder: "AlphabeticalOrder",
     SortedColumnName: null,
     SortedColumnOrder: null,
     Format: 0,
     LegacyDatafeedFileStructure: false,
     Variables: [
         {
             VariableCode: "CPI",
             ProductTypeCode: "WMC",
             MeasureCodes: [
                 "L"
             ]
         },
         {
             VariableCode: "GDP$",
             ProductTypeCode: "WMC",
             MeasureCodes: [
                 "L",
                 "PY",
                 "DY"
             ]
         }
     ],
     Regions: [
         {
             DatabankCode: "WDMacro",
             RegionCode: "GBR"
         },
         {
             DatabankCode: "WDMacro",
             RegionCode: "USA"
         }
     ],
     ShareCodeId: null,
     ListingType: "Private",
     IsDatafeed: false,<br>
     Name: "Selection - (Updated 2017/05/09)",
     StartYear: 2015,
     EndYear: 2021,
     Frequency: "Annual",
     StackedQuarters: false
  }
     DownloadUrl: "/api/download/46bbd06a-5bf2-4278-b96f-037f15e2f256",
     Id: "46bbd06a-5bf2-4278-b96f-037f15e2f256",
     SelectionType: "QuerySelection",
     MeasureCode: "L",
     IsTemporarySelection: false,
     LastUpdate: "2017-05-08T18:51:54.363",
     DatabankCode: "WDMacro",
     Sequence: "EarliestToLatest",
     GroupingMode: false,
     TransposeColumns: null,
     Order: "IndicatorLocation",
     IndicatorSortOrder: "AlphabeticalOrder",
     LocationSortOrder: "AlphabeticalOrder",
     SortedColumnName: null,
     SortedColumnOrder: null,
     Format: 0,
     LegacyDatafeedFileStructure: false,
     Variables: [
         {
             VariableCode: "CPI",
             ProductTypeCode: "WMC",
             MeasureCodes: [
                 "L"
             ]
         },
         {
             VariableCode: "GDP$",
             ProductTypeCode: "WMC",
             MeasureCodes: [
                 "L",
                 "PY",
                 "DY"
             ]
         }
     ],
     Regions: [
         {
             DatabankCode: "WDMacro",
             RegionCode: "GBR"
         },
         {
             DatabankCode: "WDMacro",
             RegionCode: "USA"
         }
     ],
     ShareCodeId: null,
     ListingType: "Private",
     IsDatafeed: false,
     Name: "Selection - (Updated 2017/05/09)",
     StartYear: 2015,
     EndYear: 2021,
     Frequency: "Annual",
     StackedQuarters: false

Shape configuration

Shape configuration layout:

{ 
Pivot: ("true", "false"),
StackedQuarters: ("true", "false"),
 Frequency: ("annual", "quarterly", "both")
}

Example shape configuration:

{ 
Pivot: ("true", "false"),
StackedQuarters: ("true", "false"),
 Frequency: "both"
}

User credentials

User credentials layout:

 {
  Username: string,
  Password: string
}

Example credentials:

 {
  Username: [email protected],
  Password: example_password
}

Key response dictionaries

Note that for the purpose of describing these dictionaries, the ( and ) characters indicate a choice and should be omitted in practice. For example, format: (“csv”, “excel”) indicates a choice between the strings csv and excel, so in practice this might look like format: “csv” or format: “excel”.

In contrast, the [ and ] characters indicate an array and must not be omitted.

Download

[ 

 {

  DatabankCode: string,

  ProducTypeCode: string,

  LocationCode: string,

  VariableCode: string,

  MeasureCode: see measure codes,

  AnnualData:

  {

    YYYY: float,

    YYYY: float,

    YYYY: float,

    …

  },

  QuarterlyData:

  {

    YYYYQQ: float,

    YYYYQQ: float,

    YYYYQQ: float,

    …

  }

 }

]

Selection

This is the same as the selection dictionary in the request body. However, the response version always includes fields that are otherwise optional in the request. These include: LastUpdate, ContactId, ShareCodeId, and IsDataFeed.

User

{

  ContactId: string,

  CompanyId: string,

  UserName: string,

  ApiKey: string,

  Email: string,

  Title: string,

  LastName: string,

  FirstName: string,

  CompanyName: string,

  IsTrialist: string,

  IsLockedOut: string,

  Roles: [string],

  SavedSelections:

  [ {

   Id: string,

   Name: string,

   Url: string

  } ]

 }

Measure codes

Several of these request and response objects have a MeasureCode field. This value describes how the annual data is represented. The available options are:

Conventions, limits, and access

Conventions

The following list describes common conventions used by our API and should help to inform the construction of valid HTTP requests:

Rate limits

The Global Data Workstation imposes rate limits in order to prevent abuse and keep the service fast and accessible to everyone. Currently, it doesn’t impose a hard limit. I.e. there is no policy that will drop API requests after a certain limit is reached. However, this is subject to change, in which case requests made beyond this threshold will be rejected with a 503 Service Unavailable HTTP status.

We do employ a soft limit, which is currently around 60 requests per minute. If this limit has been reached, then a delay will be imposed on subsequent requests in order to bring down the number of requests into the rate limit window. Please note: this rate is not guaranteed and may fluctuate with server load.

The following HTTP headers are included in every API response:

Page limits

The Global Data Workstation limits the maximum number of data rows returned in a single API call to 20,000. To retrieve more data than this, you will need to use paging. See below on ‘Downloading entire databanks’.

Authorisation and variables access

Each API key is authenticated by an organisation’s subscription and provides access only to databanks and variables authorised in the subscription. Any attempt to circumvent this authorisation and/or use another organisation’s API key will be in breach of the terms and conditions of your subscription contract. The API does not return data for unauthorised variable requests, and it is throttled to prevent abuse and protect the main service.

Example applications

We have a full range of examples on Github that cover everything from C# and Python to Tableau and Microsoft Power BI.

You can also test out the full range of endpoints using the API Swagger, here, or our Postman collection, here.

A note about redirects

The File Download and Queue Download endpoints employ a redirect response to indicate successful completion and to pass along the newly generated download URL. Any client-side API call must expose this redirect URL (i.e. the download URL). However, in many web browsers this is will not be possible because of CORS restrictions. Therefore, setting up an intermediary application is necessary to make these calls and forward along the redirect URL. In our example code, we have included a simple Webtask script (js/file-download.js) to indicate one way this can be done.

Webtasks are one method for launching simple, lightweight web applications like this, so your code can run remotely without the need to configure server hardware or software. This particular framework uses Javascript and a wide range of NPM modules. More on Webtasks at https://webtask.io/.

var request = require(‘[email protected]’)

API_URL = ‘https://services.oxfordeconomics.com’;

// this function takes a saved selection id and makes a GET

// request to the filedownload endpoint, forwarding the download

// url, which appears as the redirect location in the response headers

// from the api

//

// note: request headers and body are accessible via the context

// object e.g. context.data.*, context.headers.* and the response

// data is simply handed off to the callback function. in this

// case we’re returning the redirect url as plain text

module.exports = function (context, callback)

{

    var api_url = API_URL + ‘/api/filedownload’;

    var api_key = context.data.api_key;

    var api_resource_id = context.data.api_resource_id || “”;

    if (!api_resource_id || !api_key)

    {

            callback(400, null);

    }

    else

    {

      api_resource_id = “/” + api_resource_id;

      var options =

    {

      method: ‘GET’,

      followRedirect: false,

      uri: api_url + api_resource_id,

      headers:

      {

        ‘Api-Key’: api_key

      }

    };

    request(options, function(error, response, body)

    {

      if (response.statusCode >= 300 && response.statusCode < 400)

      {

        callback(null, response.headers.location);

      }

      else

      {

        callback(response.statusCode, error);

      }

    });

}

Downloading entire databanks

In most circumstances, narrowly-defined selections are the best and quickest method to download data with the API. However, in some cases it may be necessary to fetch everything all at once. The way to do this is nearly identical to any other download but for the following two conditions:

A simple example Python implementation, using a custom selection:

#

# Copyright (c) 2022 Oxford Economics Ltd. All rights reserved.

# Licensed under the MIT License. See LICENSE file in the

# project root for full license information.

#

import json

import requests

API_KEY = # insert api key

BASE_URL = ‘https://services.oxfordeconomics.com’

sample_selection = {

    ‘DatabankCode’: ‘WDMacro’,

    ‘Frequency’: ‘Annual’,

    ‘GroupingMode’: ‘false’,

    ‘IndicatorSortOrder’: ‘AlphabeticalOrder’,

    ‘IsTemporarySelection’: ‘true’,

    ‘ListingType’: ‘Private’,

    ‘LocationSortOrder’: ‘AlphabeticalOrder’,

    ‘Order’: ‘IndicatorLocation’,

    ‘Sequence’: ‘EarliestToLatest’,

    ‘StackedQuarters’: ‘false’,

    ‘StartYear’: 1980,

    ‘EndYear’: 2045,

    # note: the fields below have been assigned empty lists

    ‘Regions’: [

    ],

    ‘Variables’: [

    ]

}

def databank_download(selection_dictionary):

    headers = {‘Accept’: ‘application/json’,

        ‘Api-Key’: API_KEY,

        ‘Content-Type’: ‘application/json; charset=utf8’ }

    def _download_url(base_url, page, pagesize):

        url = base_url + ‘/api/download?includemetadata=true’

        return url + ‘&page={1}+&pagesize={2}’.format(base_url, page, page_size)

    page = 0

    page_size = 5000

    data = None

    while True:

        # note: _download_url returns a link of the form

        # ../api/download?includemetadata=true&page={page}+&page_size={page_size}

        response = requests.post(_download_url(BASE_URL, page, page_size),

            headers=headers,

            data=json.dumps(selection_dictionary))

        new_data = response.json()

        page += 1

        if data is None:

            data = new_data

        else:

            data.extend(new_data)

        if len(new_data) < page_size:

            break

    return data

if __name__ == ‘__main__’:

    data = databank_download(sample_selection)

    data_file = open(‘data_file.tmp’, ‘w’)

    json.dump(data, data_file)

A more advanced example Python implementation, using a saved selection ID:

#

# Copyright (c) 2017 Oxford Economics Ltd. All rights reserved.

# Licensed under the MIT License. See LICENSE file in the

# project root for full license information.

#

import sys

import json

import requests

import time

# WARNING: In production these values should be provided via environment or command line args

API_KEY = ‘<<ENTER API KEY HERE>>’

SELECTION = {‘ID’: ‘<<ENTER SELECTION ID HERE>>’, ‘NAME’: ‘<<ENTER NAME OF SELECTION HERE>>’}

PAGE_SIZE = 20000   # set to number of records per page (20k is the max allowed by API)

MAX_PAGES = None    # set to None for as many pages as possible

CHECKPOINT = 5      # checkpoint (save) after this many pages

def databank_download(f, selection_id, show_progress=True):

    BASE_URL = ‘https://services.oxfordeconomics.com’

    INCLUDE_METADATA=’false’ # or set = ‘true’

    headers = { ‘Accept’: ‘application/json’,

                ‘Api-Key’: API_KEY,

                ‘Content-Type’: ‘application/json; charset=utf8’ }

    def _download_url(base_url, page, pagesize):

        return  f'{base_url}/api/download/{selection_id}?includemetadata={INCLUDE_METADATA}&page={page}&pagesize={page_size}’

    page = 0

    page_size = PAGE_SIZE

    max_pages = MAX_PAGES

    data = None

    api_request_time_total = 0.0

    disk_write_time_total = 0.0

    while True:

        # download selection_id with some escalating wait retry logic

        max_retries = 5

        retries = 0

        while retries < max_retries:

            try:

                start = time.time()

                response = requests.get(_download_url(BASE_URL, page, page_size), headers=headers)

                end = time.time()

                api_request_time_total += (end – start)

                break

            except ConnectionError as msg:

                print(f’ConnectionError Exception!\n{msg}’)

                retries += 1

                print(f’Waiting to retry ({retries}/5) in {retries*5} seconds…\n{msg}’)

                time.sleep(retries*5)

                pass

            except TimeoutError as msg:

                print(f’ConnectionError Exception!\n{msg}’)

                retries += 1

                print(f’Waiting to retry ({retries}/5) in {retries*5} seconds…\n{msg}’)

                time.sleep(retries*5)

                pass

            except Exception as msg:

                print(f’Exception!\n{msg}’)

                raise RuntimeError(‘Something bad happened!’)

        if retries == max_retries:

            return page, api_request_time_total, disk_write_time_total

        page_data = response.json()

        page += 1

        if show_progress:

            sys.stdout.write(‘#’)

            sys.stdout.flush()

        # accumulate data

        if data is None:

             data = page_data

        else:

             data.extend(page_data)

        # dump to file every so many pages

        if page % CHECKPOINT == 0 and data != None:

            sys.stdout.write(‘+’)

            sys.stdout.flush()

            start = time.time()

            json.dump(data, f)

            end = time.time()

            disk_write_time_total += (end – start)

            data = None

        # get all pages or limit to max_pages

        if max_pages == None:

            pass

        elif page == max_pages:

            break

        # have we got to the end

        if len(page_data) < page_size:

            break

    # dump the last chunk of data, if any

    if not data is None:

        start = time.time()

        json.dump(data, f)

        end = time.time()

        disk_write_time_total += (end – start)

    return page, api_request_time_total, disk_write_time_total

if __name__ == ‘__main__’:

    SELECTION_ID = SELECTION[‘ID’]

    DATA_FILE = SELECTION[‘NAME’] + ‘.tmp’

    start = time.asctime(time.localtime(time.time()))

    print(f’Started @ {start} | DATA_FILE={DATA_FILE}, PAGE_SIZE={PAGE_SIZE}, MAX_PAGES={MAX_PAGES}, CHECKPOINT={CHECKPOINT}’)

    with open(DATA_FILE, ‘w’) as f:

        pages, api_request_time_total, disk_write_time_total = databank_download(f, SELECTION_ID)

    end = time.asctime(time.localtime(time.time()))

    print(f’\nCOMPLETE! Downloaded {pages} pages’)

    print(f’Started @ {start}’)

    print(f’Ended @ {end}’)

    print(‘Total API Request Time: %.1f secs, %.1f mins’ % (api_request_time_total, api_request_time_total/60))

    print(‘Total Disk Write Time: %.1f secs, %.1f mins’ % (disk_write_time_total, disk_write_time_total/60))

”’

Please note:

If you’re using vscode to develop/debug this program and it encounters an SSL error, then do the following:

#     mklink “<YOUR ANACONDA LOCATION>\Anaconda3\DLLs\libcrypto-1_1-x64.dll” “<YOUR ANACONDA LOCATION>\Anaconda3\Library\bin\libcrypto-1_1-x64.dll”

#     mklink “<YOUR ANACONDA LOCATION>\Anaconda3\DLLs\libssl-1_1-x64.dll” “<YOUR ANACONDA LOCATION>\Anaconda3\Library\bin\libssl-1_1-x64.dll”

You can also physically copy them over.

The exception you may see will be something like this:

#     HTTPSConnectionPool(host=’services.oxfordeconomics.com’, port=443):

#     Max retries exceeded with url: /api/download/…?includemetadata=false&page=0+&pagesize=2000

#     (Caused by SSLError(“Can’t connect to HTTPS URL because the SSL module is not available.”))

Related Services

Post

De l’opportunité à l’impact – Évaluer les avantages économiques, sociétaux et culturels de YouTube en Belgique

Cette étude évalue la contribution de YouTube au PIB belge et au taux d’emploi en Belgique, ainsi que son impact plus large sur la société et la culture.

Find Out More

Post

Van mogelijkheden naar impact – Beoordeling van de economische, maatschappelijke en culturele voordelen van YouTube in België

Deze studie beoordeelt de bijdrage van YouTube aan het BBP en de tewerkstelling in België alsook de bredere impact op de maatschappij en cultuur.

Find Out More

Post

From opportunity to impact – Assessing the economic, societal, and cultural benefits of YouTube in Belgium

This study assesses YouTube's contribution to GDP and employment in Belgium, and its broader impact on society and culture.

Find Out More