Pipeline does not run new data

Wozniak, Joanna 1 Reputation point
2021-10-02T16:38:49.123+00:00

Hi -
I created and published a pipeline that pulls data from an Azure SQL table, processes and models it, and then appends the output to an Azure SQL table. The Azure SQL table is updated with new data every day or two. In my script, I want to model only the data that was added two days before today, using the following script:

from datetime import date, timedelta

# Date two days before today, formatted as YYYY-MM-DD
two_days_ago = (date.today() - timedelta(days=2)).strftime("%Y-%m-%d")
print(two_days_ago)

# Keep only data from 2 days ago
data_prior = data[data['MatterOpenDate'] == two_days_ago]
print(data_prior.head())

# Stop the run early when there is nothing to process
if data_prior.empty:
    print('Empty dataset')
    run.complete()
    exit()
else:
    print('Continue Process')
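
For anyone who wants to check the date logic outside the pipeline, here is a minimal sketch of the same filter and empty check. It is an illustration only, not the script from the pipeline: plain dicts stand in for the DataFrame rows, and the function name `rows_opened_on`, its `today` parameter, the `MatterId` field, and the sample values are all hypothetical. Only the `MatterOpenDate` column name comes from the script above.

```python
from datetime import date, timedelta


def rows_opened_on(rows, days_back=2, today=None):
    """Return the rows whose 'MatterOpenDate' equals today minus days_back."""
    # Resolve the date when the function is called, not at import time,
    # so every run computes its own target date.
    today = today or date.today()
    target = (today - timedelta(days=days_back)).isoformat()  # 'YYYY-MM-DD'
    return [row for row in rows if row["MatterOpenDate"] == target]


# Hypothetical sample rows; only the column name comes from the post.
sample = [
    {"MatterOpenDate": "2021-09-30", "MatterId": 1},
    {"MatterOpenDate": "2021-10-01", "MatterId": 2},
]

# Pin "today" to a fixed date so the result is reproducible.
matched = rows_opened_on(sample, today=date(2021, 10, 2))
print("Continue Process" if matched else "Empty dataset")
```

Passing `today` explicitly makes it possible to replay any past day and confirm that the filter picks up the rows you expect, which helps separate a problem in the script from a problem in how the schedule runs it.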

When I first ran my pipeline it worked great. I then published the experiment and created a recurring schedule to run it once a day.

However, the schedule keeps processing the exact same data as the original run, even when new data has been uploaded. Why, and what do I need to do for the script to run 'naturally' as written?

Thank you

Azure SQL Database
Azure Machine Learning
Azure Data Factory

1 answer

  1. Wozniak, Joanna 1 Reputation point
    2021-10-09T14:54:22.36+00:00

    I solved it, thank you.
