Hi,
I am using the Cognitive Services key phrase extractor from the mmlspark package to extract key phrases. My dataset has ~500k records (5 lakh), and the job takes too long: it runs for more than 24 hours. Is there a faster or more efficient way to extract key phrases from a dataset this large?
keyphrase = (KeyPhraseExtractor()
    .setTextCol("text")
    .setLocation("eastus")
    .setSubscriptionKey(service_key)
    .setOutputCol("keyphrase")
)
results = keyphrase.transform(df_cleaned)
I am running the job in a Synapse notebook on a Spark cluster.
Thanks,
Divya
