Commit d87ec68

Refactored ADLS set access control and added builders for different types (Azure#6173)

gapra-msft authored Nov 6, 2019
1 parent fc35074 commit d87ec68
Showing 124 changed files with 9,397 additions and 4,707 deletions.
@@ -550,6 +550,7 @@
<Or>
<Class name="~.*JavaDoc(CodeSnippets|CodeSamples|Samples)"/>
<Class name="com.azure.storage.blob.batch.ReadmeCodeSamples"/>
<Class name="com.azure.storage.file.datalake.GetSetAccessControlExample"/>
</Or>
<Bug pattern="DLS_DEAD_LOCAL_STORE,
URF_UNREAD_FIELD,
101 changes: 91 additions & 10 deletions sdk/storage/azure-storage-file-datalake/README.md
@@ -146,17 +146,35 @@ b. Use the connection string.

## Key concepts

This preview package for Java includes ADLS Gen2 specific API support made available in Blob SDK. This includes:
1. New directory level operations (Create, Rename/Move, Delete) for both hierarchical namespace enabled (HNS) storage accounts and HNS disabled storage accounts. For HNS enabled accounts, the rename/move operations are atomic.
2. Permission related operations (Get/Set ACLs) for hierarchical namespace enabled (HNS) accounts.
DataLake Storage Gen2 was designed to:
- Service multiple petabytes of information while sustaining hundreds of gigabits of throughput
- Allow you to easily manage massive amounts of data

HNS enabled accounts in ADLS Gen2 can also now leverage all of the operations available in Blob SDK. Support for File level semantics for ADLS Gen2 is planned to be made available in Blob SDK in a later release. In the meantime, please find below mapping for ADLS Gen2 terminology to Blob terminology
Key Features of DataLake Storage Gen2 include:
- Hadoop compatible access
- A superset of POSIX permissions
- Cost effective in terms of low-cost storage capacity and transactions
- Optimized driver for big data analytics

|ADLS Gen2 | Blob |
| ---------- | ---------- |
|Filesystem | Container |
|Folder | Directory |
|File | Blob |
A fundamental part of Data Lake Storage Gen2 is the addition of a hierarchical namespace to Blob storage. The hierarchical namespace organizes objects/files into a hierarchy of directories for efficient data access.

In the past, cloud-based analytics had to compromise in areas of performance, management, and security. Data Lake Storage Gen2 addresses each of these aspects in the following ways:
- Performance is optimized because you do not need to copy or transform data as a prerequisite for analysis. The hierarchical namespace greatly improves the performance of directory management operations, which improves overall job performance.
- Management is easier because you can organize and manipulate files through directories and subdirectories.
- Security is enforceable because you can define POSIX permissions on directories or individual files.
- Cost effectiveness is made possible as Data Lake Storage Gen2 is built on top of the low-cost Azure Blob storage. The additional features further lower the total cost of ownership for running big data analytics on Azure.

Data Lake Storage Gen2 offers two types of resources:

- The _filesystem_, used via `DataLakeFileSystemClient`
- The _path_ (a file or directory), used via `DataLakeFileClient` or `DataLakeDirectoryClient`

|ADLS Gen2 | Blob |
| --------------------------| ---------- |
|Filesystem | Container |
|Path (File or Directory) | Blob |

Note: This client library does not support hierarchical namespace (HNS) disabled storage accounts.

## Examples

@@ -165,6 +183,7 @@ The following sections provide several code snippets covering some of the most c
- [Create a `DataLakeServiceClient`](#create-a-datalakeserviceclient)
- [Create a `DataLakeFileSystemClient`](#create-a-filesystemclient)
- [Create a `DataLakeFileClient`](#create-a-fileclient)
- [Create a `DataLakeDirectoryClient`](#create-a-directoryclient)
- [Create a file system](#create-a-filesystem)
- [Upload a file from a stream](#upload-a-file-from-a-stream)
- [Read a file to a stream](#read-a-file-to-a-stream)
@@ -223,6 +242,27 @@ DataLakeFileClient fileClient = new DataLakePathClientBuilder()
.buildClient();
```

### Create a `DataLakeDirectoryClient`

Get a `DataLakeDirectoryClient` using a `DataLakeFileSystemClient`.

```java
DataLakeDirectoryClient directoryClient = dataLakeFileSystemClient.getDirectoryClient("mydir");
```

or

Create a `DataLakeDirectoryClient` from the builder using a [`sasToken`](#get-credentials) generated above.

```java
DataLakeDirectoryClient directoryClient = new DataLakePathClientBuilder()
.endpoint("<your-storage-dfs-url>")
.sasToken("<your-sasToken>")
.fileSystemName("myfilesystem")
.pathName("mydir")
.buildClient();
```

### Create a file system

Create a file system using a `DataLakeServiceClient`.
Expand All @@ -233,7 +273,7 @@ dataLakeServiceClient.createFileSystem("myfilesystem");

or

Create a container using a `DataLakeFileSystemClient`.
Create a file system using a `DataLakeFileSystemClient`.

```java
dataLakeFileSystemClient.create();
@@ -274,6 +314,46 @@ dataLakeFileSystemClient.listPaths()
);
```

### Rename a file

Rename a file using a `DataLakeFileClient`.

```java
DataLakeFileClient fileClient = dataLakeFileSystemClient.getFileClient("myfile");
fileClient.create();
fileClient.rename("new-file-name");
```

### Rename a directory

Rename a directory using a `DataLakeDirectoryClient`.

```java
DataLakeDirectoryClient directoryClient = dataLakeFileSystemClient.getDirectoryClient("mydir");
directoryClient.create();
directoryClient.rename("new-directory-name");
```

### Get file properties

Get properties from a file using a `DataLakeFileClient`.

```java
DataLakeFileClient fileClient = dataLakeFileSystemClient.getFileClient("myfile");
fileClient.create();
PathProperties properties = fileClient.getProperties();
```

### Get directory properties

Get properties from a directory using a `DataLakeDirectoryClient`.

```java
DataLakeDirectoryClient directoryClient = dataLakeFileSystemClient.getDirectoryClient("mydir");
directoryClient.create();
PathProperties properties = directoryClient.getProperties();
```
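### Set a file's access control list

This commit reworks the access-control setters into `setAccessControlList` and `setPermissions`. The sketch below assumes the sync `DataLakeFileClient` mirrors the `setAccessControlList(List<PathAccessControlEntry>, String group, String owner)` signature introduced on the async client in this commit, and that `PathAccessControlEntry.parseList` (used in the diff) accepts the comma-separated POSIX ACL text form; the group/owner placeholders are illustrative.

```java
DataLakeFileClient fileClient = dataLakeFileSystemClient.getFileClient("myfile");
fileClient.create();
// Parse a POSIX-style ACL string into entries, then apply it together with group and owner.
List<PathAccessControlEntry> acl =
    PathAccessControlEntry.parseList("user::rwx,group::r--,other::---");
fileClient.setAccessControlList(acl, "<group>", "<owner>");
```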

### Authenticate with Azure Identity

The [Azure Identity library][identity] provides Azure Active Directory support for authenticating with Azure Storage.
@@ -317,6 +397,7 @@ This project has adopted the [Microsoft Open Source Code of Conduct][coc]. For m
[storage_account_create_portal]: https://docs.microsoft.com/azure/storage/common/storage-quickstart-create-account?tabs=azure-portal
[identity]: https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/identity/azure-identity/README.md
[samples]: src/samples
[error_codes]: https://docs.microsoft.com/en-us/rest/api/storageservices/data-lake-storage-gen2
[cla]: https://cla.microsoft.com
[coc]: https://opensource.microsoft.com/codeofconduct/
[coc_faq]: https://opensource.microsoft.com/codeofconduct/faq/
@@ -29,14 +29,17 @@
import com.azure.storage.file.datalake.implementation.models.SourceModifiedAccessConditions;
import com.azure.storage.file.datalake.models.DataLakeRequestConditions;
import com.azure.storage.file.datalake.models.PathAccessControl;
import com.azure.storage.file.datalake.models.PathAccessControlEntry;
import com.azure.storage.file.datalake.models.PathHttpHeaders;
import com.azure.storage.file.datalake.models.PathInfo;
import com.azure.storage.file.datalake.models.PathItem;
import com.azure.storage.file.datalake.models.PathPermissions;
import com.azure.storage.file.datalake.models.PathProperties;
import reactor.core.publisher.Mono;

import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.Objects;

@@ -412,54 +415,108 @@ public Mono<Response<PathProperties>> getPropertiesWithResponse(DataLakeRequestC
}

/**
* Changes the access control for a resource.
* Changes the access control list, group and/or owner for a resource.
*
* <p><strong>Code Samples</strong></p>
*
* {@codesnippet com.azure.storage.file.datalake.DataLakePathAsyncClient.setAccessControl#PathAccessControl}
* {@codesnippet com.azure.storage.file.datalake.DataLakePathAsyncClient.setAccessControlList#List-String-String}
*
* <p>For more information, see the
* <a href="https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update">Azure Docs</a></p>
*
* @param accessControl {@link PathAccessControl}
* @param accessControlList A list of {@link PathAccessControlEntry} objects.
* @param group The group of the resource.
* @param owner The owner of the resource.
* @return A reactive response containing the resource info.
*/
public Mono<PathInfo> setAccessControl(PathAccessControl accessControl) {
public Mono<PathInfo> setAccessControlList(List<PathAccessControlEntry> accessControlList, String group,
String owner) {
try {
return setAccessControlWithResponse(accessControl, null).flatMap(FluxUtil::toMono);
return setAccessControlListWithResponse(accessControlList, group, owner, null).flatMap(FluxUtil::toMono);
} catch (RuntimeException ex) {
return monoError(logger, ex);
}
}

/**
* Changes the access control list, group and/or owner for a resource.
*
* <p><strong>Code Samples</strong></p>
*
* {@codesnippet com.azure.storage.file.datalake.DataLakePathAsyncClient.setAccessControlListWithResponse#List-String-String-DataLakeRequestConditions}
*
* <p>For more information, see the
* <a href="https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update">Azure Docs</a></p>
*
* @param accessControlList A list of {@link PathAccessControlEntry} objects.
* @param group The group of the resource.
* @param owner The owner of the resource.
* @param accessConditions {@link DataLakeRequestConditions}
* @return A reactive response containing the resource info.
*/
public Mono<Response<PathInfo>> setAccessControlListWithResponse(List<PathAccessControlEntry> accessControlList,
String group, String owner, DataLakeRequestConditions accessConditions) {
try {
return withContext(context -> setAccessControlWithResponse(accessControlList, null, group, owner,
accessConditions, context));
} catch (RuntimeException ex) {
return monoError(logger, ex);
}
}

/**
* Changes the access control for a resource.
* Changes the permissions, group and/or owner for a resource.
*
* <p><strong>Code Samples</strong></p>
*
* {@codesnippet com.azure.storage.file.datalake.DataLakePathAsyncClient.setAccessControlWithResponse#PathAccessControl-DataLakeRequestConditions}
* {@codesnippet com.azure.storage.file.datalake.DataLakePathAsyncClient.setPermissions#PathPermissions-String-String}
*
* <p>For more information, see the
* <a href="https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update">Azure Docs</a></p>
*
* @param accessControl {@link PathAccessControl}
* @param permissions {@link PathPermissions}
* @param group The group of the resource.
* @param owner The owner of the resource.
* @return A reactive response containing the resource info.
*/
public Mono<PathInfo> setPermissions(PathPermissions permissions, String group, String owner) {
try {
return setPermissionsWithResponse(permissions, group, owner, null).flatMap(FluxUtil::toMono);
} catch (RuntimeException ex) {
return monoError(logger, ex);
}
}

/**
* Changes the permissions, group and/or owner for a resource.
*
* <p><strong>Code Samples</strong></p>
*
* {@codesnippet com.azure.storage.file.datalake.DataLakePathAsyncClient.setPermissionsWithResponse#PathPermissions-String-String-DataLakeRequestConditions}
*
* <p>For more information, see the
* <a href="https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update">Azure Docs</a></p>
*
* @param permissions {@link PathPermissions}
* @param group The group of the resource.
* @param owner The owner of the resource.
* @param accessConditions {@link DataLakeRequestConditions}
* @return A reactive response containing the resource info.
*/
public Mono<Response<PathInfo>> setAccessControlWithResponse(PathAccessControl accessControl,
public Mono<Response<PathInfo>> setPermissionsWithResponse(PathPermissions permissions, String group, String owner,
DataLakeRequestConditions accessConditions) {
try {
return withContext(context -> setAccessControlWithResponse(accessControl, accessConditions, context));
return withContext(context -> setAccessControlWithResponse(null, permissions, group, owner,
accessConditions, context));
} catch (RuntimeException ex) {
return monoError(logger, ex);
}
}

Mono<Response<PathInfo>> setAccessControlWithResponse(PathAccessControl accessControl,
DataLakeRequestConditions accessConditions, Context context) {
Mono<Response<PathInfo>> setAccessControlWithResponse(List<PathAccessControlEntry> accessControlList,
PathPermissions permissions, String group, String owner, DataLakeRequestConditions accessConditions,
Context context) {

Objects.requireNonNull(accessControl, "accessControl can not be null");
accessConditions = accessConditions == null ? new DataLakeRequestConditions() : accessConditions;

LeaseAccessConditions lac = new LeaseAccessConditions().setLeaseId(accessConditions.getLeaseId());
@@ -469,8 +526,14 @@ Mono<Response<PathInfo>> setAccessControlWithResponse(PathAccessCo
.setIfModifiedSince(accessConditions.getIfModifiedSince())
.setIfUnmodifiedSince(accessConditions.getIfUnmodifiedSince());

return this.dataLakeStorage.paths().setAccessControlWithRestResponseAsync(null, accessControl.getOwner(),
accessControl.getGroup(), accessControl.getPermissions(), accessControl.getAcl(), null, lac, mac, context)
String permissionsString = permissions == null ? null : permissions.toString();
String accessControlListString =
accessControlList == null
? null
: PathAccessControlEntry.serializeList(accessControlList);

return this.dataLakeStorage.paths().setAccessControlWithRestResponseAsync(null, owner, group, permissionsString,
accessControlListString, null, lac, mac, context)
.map(response -> new SimpleResponse<>(response, new PathInfo(response.getDeserializedHeaders().getETag(),
response.getDeserializedHeaders().getLastModified())));
}
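The implementation above serializes the ACL with `PathAccessControlEntry.serializeList` into the POSIX access-control text format the service expects: `scope:[qualifier]:permissions` entries joined by commas. As an illustration only, not the SDK's implementation, a dependency-free sketch of that round trip:

```java
import java.util.Arrays;
import java.util.List;

public class AclTextFormat {
    /** Join entries like "user::rwx" into the comma-separated wire form. */
    static String serializeList(List<String> entries) {
        return String.join(",", entries);
    }

    /** Split the comma-separated wire form back into individual entries. */
    static List<String> parseList(String acl) {
        return Arrays.asList(acl.split(","));
    }

    public static void main(String[] args) {
        String acl = "user::rwx,group::r--,other::---";
        List<String> entries = parseList(acl);
        System.out.println(entries.size());                 // prints 3
        System.out.println(serializeList(entries).equals(acl)); // prints true
    }
}
```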
@@ -532,11 +595,10 @@ Mono<Response<PathAccessControl>> getAccessControlWithResponse(boolean returnUpn

return this.dataLakeStorage.paths().getPropertiesWithRestResponseAsync(
PathGetPropertiesAction.GET_ACCESS_CONTROL, returnUpn, null, null, lac, mac, context)
.map(response -> new SimpleResponse<>(response, new PathAccessControl()
.setAcl(response.getDeserializedHeaders().getAcl())
.setGroup(response.getDeserializedHeaders().getGroup())
.setOwner(response.getDeserializedHeaders().getOwner())
.setPermissions(response.getDeserializedHeaders().getPermissions())));
.map(response -> new SimpleResponse<>(response, new PathAccessControl(
PathAccessControlEntry.parseList(response.getDeserializedHeaders().getAcl()),
PathPermissions.parseSymbolic(response.getDeserializedHeaders().getPermissions()),
response.getDeserializedHeaders().getGroup(), response.getDeserializedHeaders().getOwner())));
}

/**