Hello @Surendiran Balasubramanian,
Thanks for the question and for using the Microsoft Q&A platform.
As we understand it, the ask here is to remove the "\" from the JSON. Please do let us know if that is not accurate.
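You need to remove the to_json function and it should work fine. to_json turns each struct into a JSON string, so the quotes inside that string get escaped with "\" when the collected list is written out as JSON. Keeping the struct as it is avoids the second round of encoding.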
Existing code:
df.select(to_json(struct("author", "title", "pages", "email")).alias("json-data")).agg(collect_list("json-data").alias("list-item"))
Update this to:
df=df.select(struct("author", "title", "pages", "email").alias("json-data")).agg(collect_list("json-data").alias("list-item"))
I have tested this, and the output looks like this:
{"list-item":[{"author":"author1","title":"title1","pages":1,"email":"author1@Stuff .com"},{"author":"author2","title":"title2","pages":2,"email":"author2@Stuff .com"},{"author":"author3","title":"title3","pages":3,"email":"author3@Stuff .com"},{"author":"author4","title":"title4","pages":4,"email":"author4@Stuff .com"}],"version":1}
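To illustrate why dropping to_json removes the backslashes, here is a minimal sketch in plain Python using only the standard json module. It is not Spark code: json.dumps stands in for the JSON writer, and the sample row and the names double_encoded and single_encoded are made up for the illustration.

```python
import json

# Hypothetical sample row, mirroring the columns used above.
row = {"author": "author1", "title": "title1", "pages": 1, "email": "author1@example.com"}

# Similar to to_json(struct(...)): the row is turned into a string first,
# so serializing the surrounding list again escapes the inner quotes.
double_encoded = json.dumps({"list-item": [json.dumps(row)]})

# Similar to struct(...) alone: the row stays a nested object
# and is serialized only once, so nothing needs escaping.
single_encoded = json.dumps({"list-item": [row]})

print(double_encoded)  # list items are strings containing \" escapes
print(single_encoded)  # list items are plain JSON objects
```

The same thing happens in the Spark case: once each row is already a string, writing it as JSON has to escape the quotes inside it.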
Please do let me know if you have any queries.
Thanks
Himanshu