Delta Sharing in Databricks – DEV Community
December 26, 2024

In our last sprint, a project brought a new requirement: exposing tables from the Databricks catalog to an external service.
Normally this kind of sharing happens from Databricks to Databricks, so sharing with a non-Databricks consumer was new for this project.

The solution is Delta Sharing. Let’s talk about the protocol before showing how we solved it.

In today’s data-driven world, secure and seamless data sharing between organizations and platforms is critical. Delta Sharing is an open protocol developed by Databricks to meet this need. It lets data providers share live data directly with consumers, without complex data pipelines or data replication.

Delta Sharing leverages the power of Delta Lake to ensure shared data is always up to date and consistent. It supports multiple data formats and integrates seamlessly with various data tools and platforms, making it a versatile solution for modern data collaboration.

In this article, we’ll explore the main features of Delta Sharing, its benefits, and how to start implementing it in a Databricks environment. Whether you are a data provider looking to share datasets, or a data consumer looking to easily access shared data, Delta Sharing offers powerful and scalable solutions to meet your needs.

Now, to what we came here for.

First we have to create a share:

CREATE SHARE IF NOT EXISTS recipiente_share;

Once it is created, we can list all existing shares:

SHOW SHARES;

Then you need to create a recipient:

CREATE RECIPIENT IF NOT EXISTS BigQueryDataConsumer
COMMENT 'Delta Sharing with BigQuery';

We can list all existing recipients:

SHOW RECIPIENTS;

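The recipient needs SELECT permission on the share. Note that a share only exposes the tables that have been added to it with ALTER SHARE ... ADD TABLE, a step not shown in this walkthrough. The grant looks like this:

GRANT SELECT
ON SHARE recipiente_share
TO RECIPIENT BigQueryDataConsumer;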

After creating the recipient and granting the necessary permissions, we can see its details:

DESCRIBE RECIPIENT BigQueryDataConsumer;

The output lists several details; in practice the most important one is “activation_link”.

This URL lets us download a credential file containing the token and the endpoint used to access the shared tables.
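For reference, the downloaded credential file is a small JSON document. The sketch below parses one with the Python standard library, assuming the field names of the open Delta Sharing profile format (`shareCredentialsVersion`, `endpoint`, `bearerToken`, and an optional `expirationTime`). The endpoint and token values are made-up placeholders, and `ShareProfile` and `parse_profile` are hypothetical helper names, not part of any library.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Placeholder credential file content; a real file comes from the activation link.
EXAMPLE_PROFILE = """
{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "<token>",
  "expirationTime": "2025-01-01T00:00:00.000Z"
}
"""


@dataclass(frozen=True)
class ShareProfile:
    """Hypothetical container for the values found in the credential file."""
    version: int
    endpoint: str
    bearer_token: str
    expiration_time: Optional[str] = None


def parse_profile(text: str) -> ShareProfile:
    """Parse the JSON credential file and check that the required fields exist."""
    raw = json.loads(text)
    required = ("shareCredentialsVersion", "endpoint", "bearerToken")
    missing = [key for key in required if key not in raw]
    if missing:
        raise ValueError(f"credential file is missing fields: {missing}")
    return ShareProfile(
        version=int(raw["shareCredentialsVersion"]),
        # Drop a trailing slash so paths can be appended consistently.
        endpoint=raw["endpoint"].rstrip("/"),
        bearer_token=raw["bearerToken"],
        expiration_time=raw.get("expirationTime"),
    )


profile = parse_profile(EXAMPLE_PROFILE)
print(profile.endpoint)
```

Because the file contains a bearer token, it should be stored like any other secret.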

We will use this information to connect external services to the share.
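As one way to consume the share from a non-Databricks service, the sketch below prepares two things a client typically needs: an authenticated request against the sharing server's REST API, and the table URL string accepted by the open-source `delta-sharing` Python connector. The `GET {endpoint}/shares` route and the `<profile>#<share>.<schema>.<table>` URL form come from the open Delta Sharing protocol and connector. The endpoint, token, file name, schema, and table names are placeholders, and both helper functions are hypothetical.

```python
from urllib.request import Request


def list_shares_request(endpoint: str, bearer_token: str) -> Request:
    """Build (but do not send) a GET request listing the shares visible to the recipient."""
    url = endpoint.rstrip("/") + "/shares"
    return Request(url, headers={"Authorization": f"Bearer {bearer_token}"}, method="GET")


def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' URL used by the delta-sharing connector."""
    return f"{profile_path}#{share}.{schema}.{table}"


# Placeholder values; the real endpoint and token come from the credential file.
request = list_shares_request("https://sharing.example.com/delta-sharing", "<token>")
print(request.get_method(), request.full_url)

# 'my_schema' and 'my_table' are made-up names for a table added to the share.
url = table_url("config.share", "recipiente_share", "my_schema", "my_table")
print(url)

# With the connector installed (pip install delta-sharing), the table could then
# be loaded as a pandas DataFrame:
#   import delta_sharing
#   df = delta_sharing.load_as_pandas(url)
```

Building the request separately from sending it keeps the token handling in one place and makes the sketch easy to check without network access.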

Thanks!

2024-12-26 13:59:31
