Building Apache Hudi 1.0.0 from Source
Building Hudi 1.0.0
1. Download Maven
https://maven.apache.org/download.cgi
URL: https://dlcdn.apache.org/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz
Download and extract it. The later steps assume Maven is extracted under /usr/local/soft.
wget https://dlcdn.apache.org/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz
tar -zxvf apache-maven-3.9.9-bin.tar.gz
2. Add the mvn environment variables
Edit the profile:
vi /etc/profile
Add the following variables:
export MAVEN_HOME=/usr/local/soft/apache-maven-3.9.9
export PATH=$PATH:$MAVEN_HOME/bin
Apply the change:
source /etc/profile
3. Add Maven mirrors
Edit /usr/local/soft/apache-maven-3.9.9/conf/settings.xml and add the entries below inside the <mirrors> element.
Both mirrors are needed: with the Aliyun mirror alone, some libraries cannot be downloaded.
<mirror>
  <id>alimaven</id>
  <name>aliyun maven</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
  <mirrorOf>central</mirrorOf>
</mirror>
<mirror>
  <id>confluent</id>
  <name>confluent maven</name>
  <url>http://packages.confluent.io/maven/</url>
  <mirrorOf>confluent</mirrorOf>
</mirror>
4. Verify mvn
mvn -v
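If `mvn -v` reports "command not found", the PATH change from step 2 has usually not been loaded into the current shell. A minimal sketch of a check that makes this explicit; `check_mvn` is a hypothetical helper name, not part of Maven:

```shell
# Hypothetical helper: report whether mvn is reachable from this shell.
check_mvn() {
  if command -v mvn >/dev/null 2>&1; then
    # Same report as the manual check: Maven version, Java version, OS.
    mvn -v
  else
    echo "mvn not found on PATH; MAVEN_HOME is '${MAVEN_HOME:-unset}'" >&2
    return 1
  fi
}

# On failure, point back to the profile edited in step 2.
check_mvn || echo "re-run 'source /etc/profile' and check MAVEN_HOME"
```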
5. Download Hudi 1.0.0
Hudi download page:
Download | Apache Hudi
or Index of /hudi/1.0.0
Download:
wget https://downloads.apache.org/hudi/1.0.0/hudi-1.0.0.src.tgz
6. Extract Hudi
tar -zxvf hudi-1.0.0.src.tgz
7. Modify the Hudi source
a. In /usr/local/soft/hudi-1.0.0/hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/testutils/HiveTestUtil.java, line 250, change zkServer.shutdown(true); to zkServer.shutdown();
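The same edit can be scripted instead of made by hand. The sketch below assumes GNU sed (for `-i` without a suffix argument) and matches the call by its text rather than by line number; `patch_zk_shutdown` is a hypothetical helper name introduced only for illustration:

```shell
# Hypothetical helper: replace zkServer.shutdown(true); with zkServer.shutdown();
# in the file given as $1. Assumes GNU sed.
patch_zk_shutdown() {
  file="$1"
  # Keep a backup next to the original so the change is easy to revert.
  cp "$file" "$file.bak"
  # In a basic regex the parentheses are literal; only the dot needs escaping.
  sed -i 's/zkServer\.shutdown(true);/zkServer.shutdown();/' "$file"
  # Show the resulting call sites with line numbers for a quick visual check.
  grep -n 'zkServer\.shutdown' "$file"
}
```

Call it with the full path of HiveTestUtil.java from step a as the only argument. The printed line should read zkServer.shutdown(); and the original file remains available with a .bak suffix.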
b. In /usr/local/soft/hudi-1.0.0/pom.xml, comment out or remove line 410.
Then download the Confluent 5.5.0 archive and install its jars into the local Maven repository:
cd /usr/local/soft
wget http://packages.confluent.io/archive/5.5/confluent-5.5.0-2.12.zip
unzip confluent-5.5.0-2.12.zip
mvn install:install-file -DgroupId=io.confluent -DartifactId=common-config -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/confluent-common/common-config-5.5.0.jar
mvn install:install-file -DgroupId=io.confluent -DartifactId=common-utils -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/confluent-common/common-utils-5.5.0.jar
mvn install:install-file -DgroupId=io.confluent -DartifactId=kafka-avro-serializer -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/kafka-rest/kafka-avro-serializer-5.5.0.jar
mvn install:install-file -DgroupId=io.confluent -DartifactId=kafka-schema-registry-client -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/kafka-rest/kafka-schema-registry-client-5.5.0.jar
mvn install:install-file -DgroupId=io.confluent -DartifactId=kafka-json-schema-serializer -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/kafka-rest/kafka-json-schema-serializer-5.5.0.jar
c. Add the following dependencies to both of these pom.xml files:
/usr/local/soft/hudi-1.0.0/packaging/hudi-spark-bundle/pom.xml
/usr/local/soft/hudi-1.0.0/packaging/hudi-utilities-bundle/pom.xml
<!-- Add Jetty at the version configured by Hudi -->
<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-server</artifactId>
  <version>${jetty.version}</version>
</dependency>
<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <version>${jetty.version}</version>
</dependency>
<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-webapp</artifactId>
  <version>${jetty.version}</version>
</dependency>
<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-http</artifactId>
  <version>${jetty.version}</version>
</dependency>
8. Build Hudi
cd hudi-1.0.0
mvn clean package -DskipTests -Dspark3.5 -Dflink1.20 -Dscala-2.12 -Dhadoop.version=3.4.0 -Pflink-bundle-shade-hive3
or
mvn clean package -DskipTests -Dspark3.4 -Dflink1.14 -Dscala-2.12 -Dhadoop.version=3.1.1 -Pflink-bundle-shade-hive3
References:
CDP集成Hudi-编译部署 (CDP Hudi integration: build and deployment), CSDN blog
大数据之数据湖Apache Hudi (Big data: the Apache Hudi data lake), CSDN blog
