What are the ways in which a data lake can be organized, specifically with respect to zones and access control?
First Option:
RawZone (Container)\Retail\Full
RawZone (Container)\Retail\Incr
StageZone (Container)\Retail\Full
StageZone (Container)\Retail\Incr
CuratedZone (Container)\Retail\Global
RawZone (Container)\Wholesale\Full
RawZone (Container)\Wholesale\Incr
RawZone (Container)\Wholesale\ITD
StageZone (Container)\Wholesale\Full
StageZone (Container)\Wholesale\Incr
CuratedZone (Container)\Wholesale\Global
Second Option:
Retail(Container)\RawZone\Full
Retail(Container)\RawZone\Incr
Retail(Container)\RawZone\ITD
Retail(Container)\StageZone\Full
Retail(Container)\GoldZone\Global
Wholesale(Container)\RawZone\Full
Wholesale(Container)\RawZone\Incr
Wholesale(Container)\RawZone\ITD
Wholesale(Container)\GoldZone\Global
The subject areas are: Retail, Wholesale, Online, Lease … Rentals.
In the first option, the data files are organized starting with the zone.
In the second option, the data files are organized starting with the subject area.
The second option is specific to a subject area, so access rights can easily be assigned to specific groups. The drawback is that there are as many RawZone folders as there are subject areas (the same applies to the other zones). In the first option, applying RBAC at the zone (container) level means the rights are inherited by the folders beneath it, so users who do not belong to Retail also get access to Wholesale. That would mean we have to use ACLs (?) to remove the unwanted permissions.
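To make the concern concrete, here is a minimal sketch of the two layouts. It is a toy model, not any storage service's real permission engine: it simply assumes that a grant on a container applies to every path inside it. The names `zone_first_paths`, `subject_first_paths`, and `readable_paths` are hypothetical, and only two zones and two subject areas are included for brevity.

```python
# Toy model only: assumes a container-level grant is inherited by every path
# inside that container. Real services may evaluate permissions differently.
ZONES = ["RawZone", "StageZone"]
SUBJECT_AREAS = ["Retail", "Wholesale"]


def zone_first_paths():
    """First option: one container per zone, subject areas as folders."""
    return [(zone, f"{zone}/{area}") for zone in ZONES for area in SUBJECT_AREAS]


def subject_first_paths():
    """Second option: one container per subject area, zones as folders."""
    return [(area, f"{area}/{zone}") for area in SUBJECT_AREAS for zone in ZONES]


def readable_paths(paths, granted_containers):
    """Return the paths reachable through container-level grants alone."""
    return [path for container, path in paths if container in granted_containers]


# A Retail user granted access on the RawZone container (first option)
retail_view_option1 = readable_paths(zone_first_paths(), {"RawZone"})
# A Retail user granted access on the Retail container (second option)
retail_view_option2 = readable_paths(subject_first_paths(), {"Retail"})

print(retail_view_option1)  # ['RawZone/Retail', 'RawZone/Wholesale']
print(retail_view_option2)  # ['Retail/RawZone', 'Retail/StageZone']
```

Under this assumption, the zone-first grant exposes Wholesale to the Retail user, while the subject-first grant stays inside Retail but repeats every zone in each container.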
Thanks,
grajee