This project bundles the minimal dependencies from Hadoop's FileSystem abstraction and shades them to avoid dependency conflicts.
This project is the basis for the bundled file system adapters that are based on Hadoop code but keep Flink appearing Hadoop-free from a dependency perspective.
For this to work, however, we needed to adapt Hadoop's `Configuration` class to load a (shaded) `core-default-shaded.xml` configuration with the relocated class names of classes loaded via reflection.
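
To illustrate what the relocated class names look like in practice, here is a minimal sketch of the textual rewrite that turns `core-default.xml` into `core-default-shaded.xml`. The class `CoreDefaultRelocator` and its command-line arguments are hypothetical and not part of this project; the sketch assumes a plain literal replacement is sufficient. Note that it is not idempotent: running it on an already relocated file would add the prefix a second time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of this project) that relocates class names in core-default.xml. */
final class CoreDefaultRelocator {

    static final String ORIGINAL = "org.apache.hadoop";
    static final String RELOCATED = "org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop";

    /** Replaces every literal occurrence of the original package prefix with the relocated one. */
    static String relocate(String xml) {
        return xml.replace(ORIGINAL, RELOCATED);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: core-default.xml extracted from the Hadoop jar
        // args[1]: target file, e.g. src/main/resources/core-default-shaded.xml
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);
        String xml = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
        Files.write(out, relocate(xml).getBytes(StandardCharsets.UTF_8));
    }
}
```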
If you want to change the Hadoop version this project depends on, the following steps are required to keep the shading correct:
- from the respective Hadoop jar (currently 3.1.0),
  - copy `org/apache/hadoop/conf/Configuration.java` to `src/main/java/org/apache/hadoop/conf/` and
    - replace `core-default.xml` with `core-default-shaded.xml`.
  - copy `org/apache/hadoop/util/NativeCodeLoader.java` to `src/main/java/org/apache/hadoop/util/` and
    - replace the native method stubs as in the current setup (empty methods, or return false)
  - copy `core-default.xml` to `src/main/resources/core-default-shaded.xml` and
    - change every occurrence of `org.apache.hadoop` into `org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop`
  - copy `core-site.xml` to `src/test/resources/core-site.xml` (as is)
- verify the shaded jar:
  - does not contain any unshaded classes
  - all other classes should be under `org.apache.flink.fs.shaded.hadoop3`
  - there should be a `META-INF/services/org.apache.flink.core.fs.FileSystemFactory` file pointing to two classes: `org.apache.flink.fs.s3hadoop.S3FileSystemFactory` and `org.apache.flink.fs.s3hadoop.S3AFileSystemFactory`
  - other service files under `META-INF/services` should have their names and contents in the relocated `org.apache.flink.fs.s3hadoop.shaded` package
  - contains a `core-default-shaded.xml` file
  - does not contain a `core-default.xml` or `core-site.xml` file
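
Some of the jar checks above can be automated. The following is a rough sketch only: the class `ShadedJarCheck` is hypothetical and not part of this project, and it assumes that every legitimate class file lives under `org/apache/flink/` and that the XML files sit at the root of the jar. It does not inspect the contents of the service files, and it would also flag entries such as `module-info.class` if the jar contains them.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical checker (not part of this project) for a subset of the shaded-jar rules. */
final class ShadedJarCheck {

    static final String FACTORY_SERVICE =
            "META-INF/services/org.apache.flink.core.fs.FileSystemFactory";

    /** Returns human-readable problems; an empty list means all checks passed. */
    static List<String> check(List<String> entryNames) {
        List<String> problems = new ArrayList<>();
        for (String name : entryNames) {
            // Assumption: any class outside org/apache/flink/ counts as unshaded.
            if (name.endsWith(".class") && !name.startsWith("org/apache/flink/")) {
                problems.add("unshaded class: " + name);
            }
        }
        if (!entryNames.contains("core-default-shaded.xml")) {
            problems.add("missing core-default-shaded.xml");
        }
        if (entryNames.contains("core-default.xml")) {
            problems.add("unexpected core-default.xml");
        }
        if (entryNames.contains("core-site.xml")) {
            problems.add("unexpected core-site.xml");
        }
        if (!entryNames.contains(FACTORY_SERVICE)) {
            problems.add("missing " + FACTORY_SERVICE);
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the shaded jar produced by the build
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(args[0])) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        List<String> problems = check(names);
        problems.forEach(System.out::println);
        if (!problems.isEmpty()) {
            System.exit(1);
        }
    }
}
```

Keeping `check` separate from the jar reading makes the rules testable with a plain list of entry names. The service file contents and the relocated names of other service files still need to be reviewed by hand.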