Hello Sudhan Roshan,
Here are a few options I've worked with before. Take into consideration your Databricks Runtime and Spark version before choosing which one to implement.
Azure Databricks Monitoring Library:
- Streams Spark-level events and metrics to Azure Monitor.
- No application code modifications required.
- Includes features such as single-line enablement and built-in authentication. Note: the original library targets Azure Databricks Runtimes 10.x (Spark 3.2.x) and earlier; use the updated l4jv2 branch for Runtimes 11.0 (Spark 3.3.x) and above.
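As a rough sketch of what enablement looks like with that library (the paths below are illustrative and the environment variable names follow the spark-monitoring README; verify them against the branch that matches your runtime):

```shell
# Copy the init script produced by the spark-monitoring build to DBFS
# (run from a machine with the Databricks CLI configured).
databricks fs cp spark-monitoring.sh \
  dbfs:/databricks/spark-monitoring/spark-monitoring.sh

# Then, in the cluster configuration, reference the script as an init script
# and set the Log Analytics workspace credentials as environment variables:
#   Init script:  dbfs:/databricks/spark-monitoring/spark-monitoring.sh
#   Environment:  LOG_ANALYTICS_WORKSPACE_ID=<workspace id>
#                 LOG_ANALYTICS_WORKSPACE_KEY=<workspace key>
```

No application code changes are needed after that; the listeners attach at cluster startup.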
Custom Metric Monitoring:
- Create custom application metrics.
- Integrates with Azure Monitor.
- Provides flexibility for specific use cases. Note: Requires additional development effort.
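For the custom-metric route, one common pattern is to POST records to the Azure Monitor HTTP Data Collector API from your job. The sketch below builds the required HMAC-SHA256 authorization header; the workspace id/key values and the `SparkCustomMetrics` log type are placeholders you would replace with your own.

```python
import base64
import hashlib
import hmac
import json
import urllib.request
from datetime import datetime, timezone


def build_signature(workspace_id: str, shared_key: str,
                    date: str, content_length: int) -> str:
    """Build the SharedKey authorization header for the Data Collector API."""
    string_to_sign = (f"POST\n{content_length}\napplication/json\n"
                      f"x-ms-date:{date}\n/api/logs")
    digest = hmac.new(base64.b64decode(shared_key),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    return f"SharedKey {workspace_id}:{base64.b64encode(digest).decode()}"


def post_metrics(workspace_id: str, shared_key: str, records: list) -> None:
    """POST a batch of custom metric records to a Log Analytics workspace."""
    body = json.dumps(records).encode("utf-8")
    date = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
    req = urllib.request.Request(
        f"https://{workspace_id}.ods.opinsights.azure.com"
        "/api/logs?api-version=2016-04-01",
        data=body, method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": build_signature(workspace_id, shared_key,
                                             date, len(body)),
            "Log-Type": "SparkCustomMetrics",  # placeholder custom table name
            "x-ms-date": date,
        })
    urllib.request.urlopen(req)  # raises on non-2xx responses
```

Records sent this way land in a custom table (here `SparkCustomMetrics_CL`) that you can query with KQL and alert on from Azure Monitor.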
OpenTelemetry:
- OpenTelemetry is an open-source project that provides a unified approach to instrumenting applications for collecting telemetry data (logs, metrics, and traces). Note: Requires additional development effort.
Third-Party Tools:
- Tools such as Prometheus (metrics collection), and Jaeger or Zipkin (distributed tracing), can also be integrated with Spark. Note: Requires additional development effort.
References:
- https://github.com/mspnp/spark-monitoring/tree/l4jv2
- https://cloudblogs.microsoft.com/opensource/2019/05/23/announcing-opentelemetry-cncf-merged-opencensus-opentracing/
- https://github.com/mspnp/spark-monitoring?tab=readme-ov-file
- https://techcommunity.microsoft.com/t5/azure-observability-blog/making-azure-the-best-place-to-observe-your-apps-with/ba-p/3995896
- https://github.com/open-telemetry
- https://learn.microsoft.com/en-us/azure/architecture/databricks-monitoring/application-logs
If the information helped address your question, please Accept the answer.
Luis