I solved it, thank you.
Pipeline does not run new data
Hi -
I created and published a pipeline that pulls data from an Azure SQL table, processes and models it, and then appends the output to an Azure SQL table. The Azure SQL table is updated with new data every day or two. In my script, I want to model only the data that was added two days before today, using the following script:
from datetime import date, timedelta

# Date two days before today; str() of a date is already in YYYY-MM-DD format
yesterday = date.today() - timedelta(days=2)
print(yesterday)

# Keep only the data from 2 days ago
data_prior = data[data['MatterOpenDate'] == str(yesterday)]
print(data_prior.head())

# End the run early if there is nothing to process
if data_prior.empty:
    print('Empty dataset')
    run.complete()
    exit()
else:
    print('Continue Process')
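For anyone who wants to check the date logic on its own, here is a minimal standalone sketch of the same filter using only the standard library. It is an illustration, not the actual pipeline code: the helper names `cutoff_date` and `rows_opened_on`, the `MatterID` field, and the sample rows are all hypothetical, and plain dictionaries stand in for the DataFrame. Passing `today` in as an argument makes it easy to confirm that the cutoff moves forward whenever the script is actually re-executed.

```python
from datetime import date, timedelta


def cutoff_date(today: date, days_back: int = 2) -> str:
    """Return the date `days_back` days before `today` as a YYYY-MM-DD string."""
    return (today - timedelta(days=days_back)).isoformat()


def rows_opened_on(rows, target: str):
    """Keep only the rows whose 'MatterOpenDate' equals the target date string."""
    return [row for row in rows if str(row["MatterOpenDate"]) == target]


# Hypothetical sample rows standing in for the Azure SQL data.
rows = [
    {"MatterID": 1, "MatterOpenDate": "2024-03-08"},
    {"MatterID": 2, "MatterOpenDate": "2024-03-09"},
]

# A fixed "today" is used here only so the example is reproducible;
# in the real script this would be date.today().
target = cutoff_date(date(2024, 3, 10))
selected = rows_opened_on(rows, target)

if not selected:
    print("Empty dataset")
else:
    print("Continue Process:", selected)
```

If the cutoff printed in the run logs never changes from one scheduled run to the next, that would suggest the script itself is not being re-executed, rather than a problem with the filter.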
When I first ran my pipeline, it worked great. I then published the experiment and created a recurring schedule to run it once a day.
However, the scheduled runs keep processing exactly the same data as the original run, even though new data is being uploaded. Why does this happen, and what do I need to do for the script to run 'naturally' as written?
Thank you