Grepping through blob storage from an Azure function written in Python

Jon Mart 136 Reputation points
2020-10-16T18:01:06.333+00:00

Let's say I had this CSV file in blob storage:

name,place,rank

jon,sc,5
rob,nc,6
tom,tx,7

I want to write an HTTP-triggered Python function in Azure that takes user input and essentially greps out the matching line. For example, if the user passes "jon" to the function through the URI, I want the function to return:

jon,sc,5

Example code would be outstanding.

Azure Functions
An Azure service that provides an event-driven serverless compute platform.
4,263 questions
Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
2,687 questions

Accepted answer
  1. MayankBargali-MSFT 68,476 Reputation points
    2020-10-19T07:51:29.85+00:00

    Hi @Jon Mart

    There are several ways to read the CSV file inside the function app. I used pandas for reading and formatting. You can use the code below as a reference and modify it to produce the output you need. In requirements.txt I added the lines below so that the pandas and azure-storage-blob packages are available to my function.

    pandas
    azure-storage-blob

    Some of the values are hard-coded, but you can use the user input to search for a value in a particular column. Please use app settings rather than hard-coding the connection string.
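
As a minimal sketch of the app-settings suggestion: Azure Functions exposes application settings to Python code as environment variables, so the placeholders could be read through `os.environ` instead of being written into the source. The setting names `STORAGE_CONNECTION_STRING`, `CSV_CONTAINER_NAME`, and `CSV_BLOB_NAME` and both helper functions are hypothetical, chosen only for illustration.

```python
import os
from typing import Mapping


def get_required_setting(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return an app setting, failing loudly if it is missing or empty."""
    value = environ.get(name)
    if not value:
        raise KeyError(f"App setting {name!r} is not configured")
    return value


def load_blob_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Collect the values the function needs to locate the CSV blob.

    The setting names below are assumptions made for this sketch.
    """
    return {
        "connection_string": get_required_setting("STORAGE_CONNECTION_STRING", environ),
        "container_name": get_required_setting("CSV_CONTAINER_NAME", environ),
        "blob_name": get_required_setting("CSV_BLOB_NAME", environ),
    }
```

When running locally, the same values would typically go under `Values` in local.settings.json; passing a plain dict as `environ` makes the helpers easy to exercise without touching the real environment.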

    You can refer to the pandas reference documentation if you want to change the output format, and to the Azure Storage Python documentation and examples if you want to modify any other functionality.

    Input:

    Identifier;AccessCode;RecoveryCode;First name;Last name;Department;Location  
    9012;12se74;rb9012;Rachel;Booker;Sales;Manchester  
    2070;04ap67;lg2070;Laura;Grey;Depot;London  
    4081;30no86;cj4081;Craig;Johnson;Depot;London  
    9346;14ju73;mj9346;Mary;Jenkins;Engineering;Manchester  
    5079;09ja61;js5079;Jamie;Smith;Engineering;Manchester  
    

    Python Function:

    import io
    import logging

    import azure.functions as func
    import pandas as pd
    from azure.storage.blob import BlobClient


    def main(req: func.HttpRequest) -> func.HttpResponse:
        logging.info('Python HTTP trigger function processed a request.')

        # Placeholders: read these from app settings instead of hard-coding them
        connection_string = "yourconnectionstring"
        blobName = "yourblobname"
        containerName = "yourcontainername"
        blob = BlobClient.from_connection_string(conn_str=connection_string, container_name=containerName, blob_name=blobName)

        # Download the whole blob into memory as bytes
        blobBytes = blob.download_blob().readall()

        # Parse the semicolon-separated file, keeping every column as a string
        df = pd.read_csv(io.BytesIO(blobBytes), sep=';', dtype=str)

        # Displaying output of the csv file
        logging.info(df)

        # Searching for rows where the Location column is London
        result = df[df["Location"] == "London"]

        # Displaying it as json content
        #{"Identifier":{"1":"2070","2":"4081"},"AccessCode":{"1":"04ap67","2":"30no86"},"RecoveryCode":{"1":"lg2070","2":"cj4081"},"First name":{"1":"Laura","2":"Craig"},"Last name":{"1":"Grey","2":"Johnson"},"Department":{"1":"Depot","2":"Depot"},"Location":{"1":"London","2":"London"}}
        logging.info(result.to_json())

        # Returning the output as json and removing the index
        #{"columns":["Identifier","AccessCode","RecoveryCode","First name","Last name","Department","Location"],"data":[["2070","04ap67","lg2070","Laura","Grey","Depot","London"],["4081","30no86","cj4081","Craig","Johnson","Depot","London"]]}
        logging.info(result.to_json(orient='split', index=False))

        # Returning the output as a string and removing the header and the index
        # 2070  04ap67  lg2070  Laura  Grey  Depot  London
        # 4081  30no86  cj4081  Craig  Johnson  Depot  London
        logging.info(result.to_string(header=False, index=False))

        output = result.to_string(header=False, index=False, index_names=False)

        return func.HttpResponse(
            output,
            status_code=200
        )
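
Since the question asks for a grep-style lookup driven by user input, here is a minimal sketch of that filtering step on its own. It is not the pandas approach used above: it swaps in Python's standard `csv` module so the matching lines come back in their original comma-separated form. The function name `grep_csv` and its parameters are hypothetical.

```python
import csv
import io


def grep_csv(csv_bytes: bytes, column: str, value: str, delimiter: str = ",") -> str:
    """Return the rows whose `column` equals `value`, serialized back to CSV lines."""
    # utf-8-sig tolerates a byte-order mark at the start of the file
    text = csv_bytes.decode("utf-8-sig")
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)

    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in reader:  # DictReader skips completely blank lines
        if row.get(column) == value:
            # Write the fields in header order so the line matches the source file
            writer.writerow([row[name] for name in reader.fieldnames])
    return out.getvalue()


if __name__ == "__main__":
    sample = b"name,place,rank\n\njon,sc,5\nrob,nc,6\ntom,tx,7\n"
    print(grep_csv(sample, "name", "jon"), end="")  # prints: jon,sc,5
```

Inside `main`, the search value could presumably be taken from the query string with `req.params.get("name")` and the bytes from `blob.download_blob().readall()`, with the returned string passed to `func.HttpResponse`. For the semicolon-separated input shown earlier, pass `delimiter=";"`.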
    

    I hope this helps; you can modify the code to fit your needs.
    Please 'Accept as answer' and 'Upvote' if it helped, so that it can help others in the community looking for help on similar topics.
