Paging Using the CloudConnect REST Connector


The REST Connector in CloudConnect offers a great deal of flexibility for making API calls to any RESTful API. However, when the volume of requested data is too large for the API to pass in a single response, the CloudConnect user must acquire the data using a series of requests and responses. This process, called pagination, is handled through a set of functions integrated into the REST Connector.

NOTE: This article assumes some knowledge of API connections and of how to use variables in CloudConnect Designer. For more information on CloudConnect, see the CloudConnect Designer documentation.

Pagination functions

In the REST Connector component, the Response Handling Functions let you control additional requests to the API and determine how to reach the next page of data. Three functions are available, but pagination requires only the following two:

  • generateRequestParameters - this function is executed before each API call from the REST Connector.
  • checkResponse - this function is executed after each API call.

The checkResponse function allows you to analyze the returned status (responseStatus), headers (responseHeaders), and body (responseBody).

  • If this function returns “CONTINUE”, the connector calls generateRequestParameters, and the process starts again.
  • If the checkResponse function returns “DONE_WITH_OUTPUT”, the connector finishes, and the data flows through the output port of the REST Connector component.
  • Note that you can also use the checkResponse function to do rate limit handling or to specify other error messages.

Part of the response body is the paging key, which identifies how the pages of data are marked. In the generateRequestParameters function, the paging key must be parsed from the previous response and, if more records remain to be retrieved, submitted back in the next request. Some example paging keys are the following:

  • Count of records per page
  • A query parameter in the URL
  • Hash value

REST Connector Pagination Diagram

How pagination works is specific to each API, as each one handles it differently. The basic development steps are:

  1. Learn how the API does pagination. Insert the paging key into the request URL as a variable, called ${request_body} in this article.
  2. generateRequestParameters: Parse the previous API response (lastResponseBody). Extract the paging key and assign it to the ${request_body} parameter.
  3. checkResponse: Determine if pagination is complete. Return “CONTINUE” to keep calling the API, or “DONE_WITH_OUTPUT” if the task is complete.
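
The steps above can be pictured as a loop. The following is a minimal Python model of that loop, not CloudConnect code: the helper names (run_paged_request, fake_api), the "&next=" paging key, and the three-page data set are all hypothetical, and collecting every response body in a list is an assumption made only to keep the sketch small.

```python
from typing import Callable, Dict, List


def run_paged_request(
    generate_request_parameters: Callable[[int, str], Dict[str, str]],
    call_api: Callable[[Dict[str, str]], str],
    check_response: Callable[[str], str],
    max_iterations: int = 1000,
) -> List[str]:
    """Call the API repeatedly until check_response stops returning CONTINUE."""
    bodies: List[str] = []
    last_body = ""  # there is no previous response before the first call
    for iteration in range(max_iterations):
        # Step 2: build the parameters for this call from the previous response.
        params = generate_request_parameters(iteration, last_body)
        last_body = call_api(params)
        bodies.append(last_body)
        # Step 3: decide whether another page must be requested.
        if check_response(last_body) != "CONTINUE":
            break
    return bodies


# Hypothetical three-page API: each body names the next page until the last one.
PAGES = {"": "data1 &next=p2", "p2": "data2 &next=p3", "p3": "data3"}


def gen(iteration: int, last_body: str) -> Dict[str, str]:
    marker = last_body.find("&next=")
    key = "" if marker == -1 else last_body[marker + len("&next="):]
    return {"request_body": key}


def fake_api(params: Dict[str, str]) -> str:
    return PAGES[params["request_body"]]


def check(body: str) -> str:
    return "CONTINUE" if "&next=" in body else "DONE_WITH_OUTPUT"


pages = run_paged_request(gen, fake_api, check)
```

The max_iterations guard is a design choice of this sketch: it keeps a checkResponse that never stops returning “CONTINUE” from looping forever.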

Example - Facebook Graph API - Posts

Below is an example of this process that shows how to page through responses from an API call to the Posts endpoint on the Facebook Graph API, which can be used to retrieve information about your Facebook posts.

Here is an example of the syntax for the API call (\ indicates the URL continues on the next line; it is not part of the URL):

https://graph.facebook.com/${page_id}/promotable_posts? \
fields=id,from,created_time,message&limit=${RECORDS_PER_PAGE} \
&since=${since}&until=${request_body}&access_token=${page_token}

This API call returns all posts associated with your page (${page_id}) that occur in the time frame ranging from ${since} to ${request_body}.

  • You must define the starting point (${since}) and the ending point prior to making the first API call. In the sample code below, these two values are included in the input metadata for the component as $in.0.since and $in.0.until respectively.
  • The API results are delivered in sets as determined by the variable ${RECORDS_PER_PAGE}. Page volume is controlled by setting this variable.

NOTE: Some APIs may impose limits to the number of records per call. Please review the limits of the API before setting this value.
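
To see what the final request URL looks like once the variables are filled in, the substitution can be imitated in Python with string.Template, which happens to use the same ${name} placeholder syntax. This only illustrates the result. It does not show how the REST Connector performs the substitution, and every value below (page ID, timestamps, token) is a made-up placeholder.

```python
from string import Template

# The URL from the example above, joined into a single string.
URL_TEMPLATE = Template(
    "https://graph.facebook.com/${page_id}/promotable_posts?"
    "fields=id,from,created_time,message&limit=${RECORDS_PER_PAGE}"
    "&since=${since}&until=${request_body}&access_token=${page_token}"
)

# Placeholder values for illustration only.
params = {
    "page_id": "1234567890",
    "RECORDS_PER_PAGE": "100",
    "since": "1388534400",
    "request_body": "1391212800",
    "page_token": "EXAMPLE_TOKEN",
}

# substitute() raises KeyError if any variable is missing from params.
url = URL_TEMPLATE.substitute(params)
print(url)
```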

generateRequestParameters function

For this API, you must dynamically update the ${request_body} parameter to make subsequent API calls. This parameter is populated from content in the response body.

Additional pages are denoted within the response body by the expression &until=. This expression is used to demarcate the date that follows it, which is then stored as the ${request_body} parameter and passed back into the URL for the API call.

  • For the first API call, ${request_body} would otherwise default to an empty string (""), so the sample code sets it to the value of “until” from the incoming metadata ($in.0.until).
  • For subsequent API calls, the code dynamically parses the response to extract the value of ${request_body} for the paging key for the next request.
 
function map[string, string] generateRequestParameters(map[string, string] inputEdgeRecord, integer iterationNumber, integer lastResponseStatus, map[string, string] lastResponseHeaders, string lastResponseBody) {
 
    // Copy all input parameters into the request parameters map.
    map[string, string] requestParams = inputEdgeRecord;
   
    // Set the number of records per page that the API will return. Update the parameter with this value.
    integer RECORDS_PER_PAGE = 100;
    requestParams["RECORDS_PER_PAGE"] = toString(RECORDS_PER_PAGE);
   
    // Determine the index of "&until=" within the most recent response body. If "&until=" is not present, indexOf returns -1.
    integer until = indexOf(lastResponseBody,"&until=");
   
    // Check to see if "&until=" is in the response body. If it is, parse out the timestamp that follows. 
    // If it is the first API call, the until variable will be equal to -1, so it is set to $in.0.until.
    requestParams["request_body"] = until == -1 ? $in.0.until : substring(lastResponseBody,until+7,10);
    return requestParams;
}
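
The index arithmetic in the sample may be easier to follow outside the connector. Below is a small Python sketch of the same parsing step, assuming (as the CTL sample does) that the value after "&until=" is exactly 10 characters long. The function name next_until and the response fragment are hypothetical, and a real Graph API response may be formatted differently.

```python
UNTIL_MARKER = "&until="   # 7 characters, which is where "until+7" comes from
TIMESTAMP_LENGTH = 10      # the CTL sample reads 10 characters after the marker


def next_until(last_response_body: str, initial_until: str) -> str:
    """Return the paging key for the next request."""
    position = last_response_body.find(UNTIL_MARKER)
    if position == -1:
        # No marker found (for example, on the first call): use the input value.
        return initial_until
    start = position + len(UNTIL_MARKER)
    return last_response_body[start:start + TIMESTAMP_LENGTH]


# Hypothetical, simplified response fragment.
sample_body = (
    '{"paging": {"next": "https://graph.facebook.com/123/promotable_posts'
    '?limit=100&until=1390000000&access_token=X"}}'
)

print(next_until("", "1391212800"))           # first call: falls back to the input
print(next_until(sample_body, "1391212800"))  # later calls: value after the marker
```

A fixed length of 10 is fragile if the API ever returns a value of a different length. Reading up to the next "&" or closing quote would be more robust, at the cost of a slightly longer expression.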

checkResponse function

After the generateRequestParameters logic is defined, the rules for continuing pagination must be set in the checkResponse function.

For this API, the &until= key appears in the response body if additional pages of data are available. Parse the responseBody to determine whether this key is present.

  • If the key is present, then continue pagination.
  • Otherwise, tell the REST Connector to finish.
  • If you are pulling a large volume of data, you should use this function to handle rate limiting, if necessary.
 
function string checkResponse(integer responseStatus, map[string, string] responseHeaders, string responseBody) {
    // Determine the location of "&until=" in the response body.
    integer untilLocation = indexOf(responseBody, "&until=");

    // If "&until=" is present, its index is greater than -1, so continue processing pages.
    if (untilLocation > -1) {
        return "CONTINUE";
    } else {
        return "DONE_WITH_OUTPUT";
    }
}

Additional options

In addition to “CONTINUE” and “DONE_WITH_OUTPUT”, the checkResponse function can return the following values:

  • DONE_NO_OUTPUT - The last iteration has completed. However, no data was received, so no data is sent to the component’s output port.
  • RETRY - Retry the last failed request.
  • FATAL_ERROR - A fatal error occurred. Execution of the REST Connector is aborted.
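
One possible way to combine all five return values is sketched below in Python. The decision policy is an assumption made for illustration, not behavior documented for the REST Connector: it treats HTTP 429 (Too Many Requests) as a reason to retry, any other status of 400 or above as fatal, and an empty body as "no data". Whether the connector waits before a retry is not covered here.

```python
def check_response(response_status: int, response_body: str) -> str:
    """Map a response to one of the checkResponse return values (illustrative policy)."""
    if response_status == 429:
        # Assumed rate-limit signal: ask for the same request again.
        return "RETRY"
    if response_status >= 400:
        # Assumed policy: any other client or server error stops the run.
        return "FATAL_ERROR"
    if "&until=" in response_body:
        # The paging key is present, so more pages remain.
        return "CONTINUE"
    if response_body.strip() == "":
        # Assumed meaning of "no data received": an empty body.
        return "DONE_NO_OUTPUT"
    return "DONE_WITH_OUTPUT"


print(check_response(200, "...&until=1390000000..."))
print(check_response(429, ""))
```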

For more information, see the code examples in the REST Connector component.