question

stramzik-1915 avatar image
0 Votes"
stramzik-1915 asked SamSam-3705 commented

How to pass Pagination rules during a rest api calls in copy data activity?

Hi,

I have a pipeline which basically firsts gets a access token by making a http request and once I have the token I'm doing a get request to an API with the access token as a header.

As the REST API o/p is limited I need to make multiple requests for pagination so I did follow the steps mentioned here : https://docs.microsoft.com/en-us/azure/data-factory/connector-rest#pagination-support

There are few problems with the NextPageUrl which i receive from the REST get request, The NextPageUrl starts with http and based on my experience if the Linked_service Baseurl does not start with HTTPS it does not accept the headers passed so I need to change the url from this "http://xyz.com/api/v1/?page=1" to "https://xyz.com/api/v1/?page=1" so i tried the replace expression but It says I cant write it this way "@replace('$.ConnectResponse.Metadata.Paging.NextPageURL', 'http', 'https')" or "@replace($.ConnectResponse.Metadata.Paging.NextPageURL, 'http', 'https')"


Please advise how can I change the string?


Secondly I though okay let me pass the page number manually but how do I add a number to the current page number and stop once it reaches the max?


from my API O/p I get example TotalPages=3 and CurrentPage=1 number

so I though I could pass the queryparameter in pagination like below

key = QueryParameters.page and value = @add($.ConnectResponse.Metadata.Paging.NextPageURL, 1) ----- I've tried with quotes and without quotes --- and it does work as expected.


Please advise how can I fix this?


Lastly


Is this how i pass a header authorization token in pagination?


key = Headers['Authorization']

value = Bearer @{activity('Login').output.access_token}


I mean this is how I pass the header parameters for the rest source so I'm assuming I just need to pass it the same way correct me if I'm wrong.


Thank you in advance.

azure-data-factory
· 3
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Hello @stramzik-1915 and thank you for your question. Please let me address each of your asks separately.

The REST connector pagination is rather inflexible in my experience. Often when I run into difficulty with it, I end up placing the activity inside a loop. Doing so gives me better control. However, splitting into multiple copies is not suitable for all sink types. What are you writing to?
If you want to know more about this approach, let me know.

For the authorization, that should be placed under the "additional headers" section, not the "pagination" section. You do not expect new bearer tokens to be returned with each page, correct?

Please tell me more about the headers not being accepted in http vs https. I am trying to understand whether this is a limitation of the API you are calling, or possibly a bug in Data Factory.

Thank you for your patience!

0 Votes 0 ·

Hi @MartinJaffer-MSFT apologies on the delayed response.

My sink type would be Json or CSV, I'm writing the data to Azure DataLake Gen1. Could you please let me know how I can loop over the REST calls?

The Bearer token is active for an hour so there is a possibility that sometimes the REST calls has to loop through 1000's of pages to get the data and if this takes more than an hour then I should refresh the token. Now to answer your question for the pagination I can use the same Bearer token which I used for the first rest call.

Speaking about headers not being accepted in http vs https I think its a DataFactory limitation the reason I say that is because It works perfectly fine on Postman or Python script but when ever I try to do a get request and if my REST base url starts with http:// I get invalid headers error however if I change the base URL to https:// the same headers are accepted.

0 Votes 0 ·

Thank you for your response @stramzik-1915 .

I have provided a workaround in a new answer.

If you would like to help prioritize making the pagination more flexible, please vote for it in the feedback forum. The below will take you to several links of relevant feature requests.
https://docs.microsoft.com/answers/answers/63759/view.html

0 Votes 0 ·
MartinJaffer-MSFT avatar image
0 Votes"
MartinJaffer-MSFT answered VaibhavSingh-3595 commented

Here is one way to iterate over the pages, @stramzik-1915 .

Step 1, make a web call to find out how many total pages.

Step 2, using an array type variable, enumerate (make a list of numbers) the page numbers
17746-image.png

Step 3, use a for-each loop to copy the pages in parallel.
17784-image.png

and the inner activity is a copy activity. We pass the iteration's page number to the parameterized dataset using @{items()}
17730-image.png
17811-image.png

Because each page is now a separate copy activity, we need to parameterize the sink dataset so each page goes to a separate file, instead of overwriting each other.
17740-image.png
17759-image.png



image.png (98.4 KiB)
image.png (90.4 KiB)
image.png (72.6 KiB)
image.png (75.0 KiB)
image.png (85.1 KiB)
image.png (99.0 KiB)
· 3
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

@MartinJaffer-MSFT That works like a charm.

Thank you very much for your help :)

0 Votes 0 ·

Hello, what are the properties that has pass in web activity.

Please share in detail. @MartinJaffer-MSFT

0 Votes 0 ·

Hi can you please show how you have declared the range variable inside the pipeline because when I'm trying to declare it as an array variable my pipeline fails to work. It fetches the total_pages through a string type variable but not with an array variable.

0 Votes 0 ·
stramzik-1915 avatar image
0 Votes"
stramzik-1915 answered stramzik-1915 published

No answer replied by mistake

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

DineshMadhup-5027 avatar image
0 Votes"
DineshMadhup-5027 answered

Not sure here how web activity counts the total number of pages

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

MartinJaffer-MSFT avatar image
0 Votes"
MartinJaffer-MSFT answered

anonymous user , @AnjaliMaithani-7687
Follow up information:

The example I used, was a web activit GET call to https://reqres.in/api/users

The page metadata was included in the output, see below.

 Output
 {
     "page": 1,
     "per_page": 6,
     "total": 12,
     "total_pages": 2,
     "data": [
         {
             "id": 1,
             "email": "george.bluth@reqres.in",
             "first_name": "George",
             "last_name": "Bluth",
             "avatar": "https://reqres.in/img/faces/1-image.jpg"
         },
         {
             "id": 2,
             "email": "janet.weaver@reqres.in",
             "first_name": "Janet",
             "last_name": "Weaver",
             "avatar": "https://reqres.in/img/faces/2-image.jpg"
         },
         ...

Then I just extracted the value with

 @{activity('Web1').output.total_pages}



5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

MonikaDubey-7557 avatar image
0 Votes"
MonikaDubey-7557 answered SamSam-3705 commented

In case your API already uses filters such as page and page_size, you may give @item() in the value for page in the copy activity as shown below and you may not need to use pagenumber parameter in API source dataset.
102754-paginatiohn.png



paginatiohn.png (73.8 KiB)
· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

@MonikaDubey-7557 Hi Monika, Need your kind help in achieving this, I am facing issues. Can you please help, I tried above methods from MartinJaffer-MSFT , but it dint work, can you please help

0 Votes 0 ·