galileo first commit
JDGalileo committed Sep 9, 2021
0 parents commit 4f2e188
Showing 414 changed files with 46,333 additions and 0 deletions.
15 changes: 15 additions & 0 deletions .gitignore
@@ -0,0 +1,15 @@
__pycache__/
_ext/
build/
dist/
.cache/
.eggs/
*.egg-info/
.coverage*
*.so*
.*.swp
.vscode/
.DS_Store
.models/
.logs/
.data/
22 changes: 22 additions & 0 deletions .travis.yml
@@ -0,0 +1,22 @@
notifications:
  email: false

jobs:
  include:
    - sudo: required
      services:
        - docker
      env: DOCKER_IMAGE=jdgalileo/galileo:devel-cpu
           BUILD_TARGET=cpu
    - sudo: required
      services:
        - docker
      env: DOCKER_IMAGE=jdgalileo/galileo:devel-gpu
           BUILD_TARGET=gpu

install:
  - docker pull $DOCKER_IMAGE

script:
  - docker run --rm -e BUILD_TARGET=$BUILD_TARGET -v `pwd`:/workspace $DOCKER_IMAGE bash /workspace/docker/build_wheel.sh
  - ls dist  # /workspace exists only inside the container; on the host it is the checkout directory
409 changes: 409 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

48 changes: 48 additions & 0 deletions README.md
@@ -0,0 +1,48 @@
<div align="center">
<img src="docs/imgs/logo.jpg" height="240" />
</div>

[![Build Status](https://travis-ci.org/JDGalileo/galileo.svg?branch=main)](https://travis-ci.org/JDGalileo/galileo)
[![PyPI version](https://badge.fury.io/py/jdgalileo.svg)](https://badge.fury.io/py/jdgalileo)
[![Anaconda-Server Badge](https://anaconda.org/jdgalileo/jdgalileo/badges/version.svg)](https://anaconda.org/jdgalileo/jdgalileo)

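In recent years, graph computing has delivered notable results in search, recommendation, and risk-control scenarios, but it still faces challenges such as training on ultra-large-scale heterogeneous graphs and integrating with existing deep learning frameworks such as TensorFlow and PyTorch.

Galileo is a graph deep learning framework that offers ultra-large scale, ease of use, extensibility, high performance, and dual backends. It aims to solve the difficulty of deploying ultra-large-scale graph algorithms in industrial settings, and provides training, evaluation, and prediction for models such as graph neural networks and graph embeddings.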

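# Architecture

<div align="center">
<img src="docs/imgs/arch.jpg" height="450" /><br/>
Overall architecture of Galileo
</div>

Galileo follows a layered design with three main layers: a distributed graph engine, a multi-backend graph framework, and graph models.
- **Distributed high-performance graph engine**: represents graph data with compact, efficient in-memory structures, supporting **ultra-large-scale heterogeneous graphs** with a very low memory footprint; uses a ZeroCopy mechanism along the entire call path for high-performance graph queries and graph sampling.
- **Multi-backend graph framework**: supports both TensorFlow and PyTorch backends, configuration-driven single-machine and distributed training, and Keras and Estimator training; provides unified graph query and graph sampling interfaces, making it **easy to extend**.
- **Graph models**: decouples data from models to improve code reuse; a component-based design lowers the difficulty of implementing models; models can be written in the Message Passing paradigm, and Python code can also access the training backend interfaces directly, making the framework **easy to use and highly flexible**.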


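# Getting started
We provide [pip and conda packages](docs/pip.md) for Galileo. We recommend using Galileo in the [docker image](https://hub.docker.com/r/jdgalileo/galileo), which avoids the hassle of installing dependencies. You can also [build and install Galileo from source](docs/install.md).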
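
As a rough sketch of the docker route, the snippet below composes an image name and, only when explicitly asked, pulls and starts it. The tag pattern `jdgalileo/galileo:<version>-<target>` follows the naming used in `docker/build.sh` of this commit; whether a given tag is actually published on Docker Hub is an assumption. The helper `galileo_image` and the `RUN_DOCKER` switch are hypothetical names introduced for illustration, and the `/workspace` mount point mirrors the one used in `.travis.yml`.

```shell
# Sketch only: nothing here is part of the Galileo package itself.
set -e -u

# Hypothetical helper: compose the image name for a version and a target
# (cpu or gpu), following the tag pattern used in docker/build.sh.
galileo_image() {
  local version=${1:-1.0.0} target=${2:-cpu}
  echo "jdgalileo/galileo:${version}-${target}"
}

# Set RUN_DOCKER=1 to actually pull the image and open a shell in it,
# with the current directory mounted at /workspace.
if [ "${RUN_DOCKER:-0}" -eq 1 ]; then
  image=$(galileo_image 1.0.0 cpu)
  docker pull "${image}"
  docker run --rm -it -v "$(pwd)":/workspace "${image}" bash
fi
```

Running the snippet as-is only defines the helper; invoking it with `RUN_DOCKER=1` requires a working Docker installation and network access.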

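Read the [getting-started tutorial](docs/introduce.md) to start using Galileo.

If the [graph models](examples/README.md) currently implemented in Galileo do not meet your needs, you can [customize a graph model](docs/custom.md).

To use your own graph data, see [graph data preparation](docs/data_prepare.md).

If your graph data is large, see [distributed training](docs/train.md).

To learn more about the Galileo interfaces, see the [API documentation](docs/api.md).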


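# Contact us
You are welcome to contact us through issues or the mailing list ([email protected]).

# LICENSE
The Galileo graph deep learning framework is licensed under the Apache License 2.0.

# Acknowledgements
The Galileo graph deep learning framework is developed by the Technology and Data Center of JD Retail, JD Group. We thank the JD Retail algorithm channel for its strong support, and sibling departments such as the Business Growth Division and the Search and Recommendation Platform Department for their valuable feedback during development and use.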

10 changes: 10 additions & 0 deletions conda/build.sh
@@ -0,0 +1,10 @@
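#!/bin/bash
# Install the prebuilt cp38 wheel into the conda-build host environment.

# Locate the host environment directory (_h_env*) next to the pip cache directory.
pre=$(ls -d "$PIP_CACHE_DIR"/../_h_env*)
src_dir="$RECIPE_DIR/.."

echo "build $src_dir $PKG_VERSION to $pre"

cd "$src_dir"
pip install --no-deps --prefix "$pre" "dist/jdgalileo-${PKG_VERSION}-cp38-cp38-linux_x86_64.whl"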

16 changes: 16 additions & 0 deletions conda/meta.yaml
@@ -0,0 +1,16 @@
{% set version = "1.0.0" %}

package:
  name: jdgalileo
  version: {{ version }}

build:
  number: 1
  binary_relocation: False

requirements:
  run:
    - python >=3.8

about:
  license: Apache 2.0
2 changes: 2 additions & 0 deletions conda/post-link.sh
@@ -0,0 +1,2 @@
#!/bin/bash
python -c "import galileo;print(galileo.libs_dir)" > /etc/ld.so.conf.d/galileo.conf && ldconfig
3 changes: 3 additions & 0 deletions conda/pre-unlink.sh
@@ -0,0 +1,3 @@
#!/bin/bash
/bin/rm -f /etc/ld.so.conf.d/galileo.conf
ldconfig
22 changes: 22 additions & 0 deletions docker/base.Dockerfile
@@ -0,0 +1,22 @@
# Copyright 2020 JD.com, Inc. Galileo Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

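ARG BASE_IMAGE
FROM ${BASE_IMAGE}

# An ARG declared before FROM is not visible after it, so declare INSTALL_CUDA inside the build stage.
ARG INSTALL_CUDA=0
COPY base_deps.sh /tmp/
ENV INSTALL_CUDA=${INSTALL_CUDA} MAX_JOBS=16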
RUN bash /tmp/base_deps.sh && rm -f /tmp/base_deps.sh
159 changes: 159 additions & 0 deletions docker/base_deps.sh
@@ -0,0 +1,159 @@
#!/bin/bash
# Copyright 2020 JD.com, Inc. Galileo Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

set -e -u

INSTALL_CUDA=${INSTALL_CUDA:-0}
MAX_JOBS=${MAX_JOBS:-32}
PYPI_URL=https://mirrors.ustc.edu.cn/pypi/web/simple
TORCH_URL=https://download.pytorch.org/whl/torch_stable.html
# Older Hadoop releases are served from archive.apache.org rather than downloads.apache.org
HADOOP_URL=${HADOOP_URL:-https://archive.apache.org/dist/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz}
ZK_BIN_URL=${ZK_BIN_URL:-https://archive.apache.org/dist/zookeeper/zookeeper-3.5.6/apache-zookeeper-3.5.6-bin.tar.gz}
OPENSSL_URL=${OPENSSL_URL:-http://www.openssl.org/source/openssl-1.1.0c.tar.gz}


function install_gcc() {
#for nvidia/cuda:10.1-cudnn7-devel-centos7
GCC_VERSION=8.4.0
GCC_URL=https://mirrors.ustc.edu.cn/gnu/gcc/gcc-${GCC_VERSION}/gcc-${GCC_VERSION}.tar.xz
MAKE_URL=https://mirrors.ustc.edu.cn/gnu/make/make-4.3.tar.gz
CMAKE_URL=https://github.com/Kitware/CMake/releases/download/v3.19.2/cmake-3.19.2-Linux-x86_64.sh

sed -e 's|^mirrorlist=|#mirrorlist=|g' \
-e 's|^#baseurl=http://mirror.centos.org/centos|baseurl=https://mirrors.ustc.edu.cn/centos|g' \
-i.bak /etc/yum.repos.d/CentOS-Base.repo
yum -y install wget which vim
yum -y groupinstall 'Development Tools'
wget -qO make.tar.gz ${MAKE_URL} && tar xf make.tar.gz && rm -f make.tar.gz
pushd make-4.3 && ./configure --prefix=/usr/local/ && make -j${MAX_JOBS}
make install && popd && rm -rf make-4.3 && yum -y remove make
ln -srf /usr/local/bin/make /usr/bin/gmake
ln -srf /usr/local/bin/make /usr/bin/make

wget -qO cmake.sh ${CMAKE_URL} && bash cmake.sh --skip-license --prefix=/usr/local && rm -f cmake.sh
wget -q ${GCC_URL} && tar xf gcc-${GCC_VERSION}.tar.xz && rm -f gcc-${GCC_VERSION}.tar.xz
pushd gcc-${GCC_VERSION} && ./contrib/download_prerequisites
./configure --enable-checking=release --enable-languages=c,c++,obj-c++ --disable-multilib
make -j${MAX_JOBS} && make install && popd && rm -rf gcc-${GCC_VERSION} && yum -y remove gcc
}

function install_ssl() {
yum -y install wget zlib zlib-devel
wget -O openssl-1.1.0c.tar.gz ${OPENSSL_URL}
tar xf openssl-1.1.0c.tar.gz
pushd openssl-1.1.0c
./config shared zlib
make -j${MAX_JOBS}
make install_sw
popd
rm -fr openssl-1.1.0*
yum clean all && rm -rf /var/cache/yum/*
}

function install_zk() {
yum -y install java-1.8.0-openjdk java-1.8.0-openjdk-devel
yum clean all && rm -rf /var/cache/yum/*
echo "export JAVA_HOME=/usr/lib/jvm/java" >> /root/.bashrc
echo "export CLASSPATH=\$(/opt/hadoop/bin/hadoop classpath --glob)" \
>> /root/.bashrc

wget -O hadoop.tar.gz ${HADOOP_URL}
tar -xf hadoop.tar.gz -C /opt
rm -f hadoop.tar.gz
mv /opt/hadoop* /opt/hadoop

wget -O zookeeper.tar.gz ${ZK_BIN_URL}
tar xf zookeeper.tar.gz -C /usr/local/
rm -f zookeeper.tar.gz && mv /usr/local/apache-zookeeper-3.5.6-bin \
/usr/local/zookeeper
mkdir -p /usr/local/zookeeper/data
echo -e "dataDir=/usr/local/zookeeper/data\nclientPort=2181" \
> /usr/local/zookeeper/conf/zoo.cfg
echo "JAVA_HOME=/usr/lib/jvm/java" \
> /usr/local/zookeeper/conf/zookeeper-env.sh
}

function setup_env() {
paths=""
libs="/lib64:/usr/local/lib:/usr/local/lib64"
libs+=":/usr/lib/jvm/java/jre/lib/amd64/server"
libs+=":/opt/hadoop/lib/native"
if [ ${INSTALL_CUDA} -ne 0 ];then
paths+="/usr/local/bin"
paths+=":/usr/local/anaconda3/bin"
paths+=":/usr/local/cuda/bin"
libs+=":/usr/local/anaconda3/lib"
libs+=":/usr/local/cuda/lib64"
libs+=":/usr/local/nvidia/lib"
libs+=":/usr/local/nvidia/lib64"
fi
paths+=":/usr/lib/jvm/java/bin"
paths+=":/usr/local/zookeeper/bin"
paths+=":/opt/hadoop/bin"
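# Strip the leading ':' left when INSTALL_CUDA=0, so PATH does not start with an empty entry (the current directory)
echo "export PATH=${paths#:}:\$PATH" >> /root/.bashrc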
echo "export LD_LIBRARY_PATH=${libs}:\$LD_LIBRARY_PATH" >> /root/.bashrc
echo "export LIBRARY_PATH=${libs}:\$LIBRARY_PATH" >> /root/.bashrc
echo "export MAX_JOBS=${MAX_JOBS}" >> /root/.bashrc
set +u
source /root/.bashrc || true
}

function install_py3() {
#for nvidia/cuda:10.1-cudnn7-devel-centos7
ANACONDA_URL=https://repo.anaconda.com/archive/Anaconda3-2020.07-Linux-x86_64.sh
wget -qO anaconda3.sh ${ANACONDA_URL} && bash anaconda3.sh -b -p /usr/local/anaconda3
rm -f anaconda3.sh && cp /usr/local/anaconda3/lib/libstdc++.so.6.0.26 /lib64
ln -srf /lib64/libstdc++.so.6.0.26 /lib64/libstdc++.so.6
ln -srf /lib64/libstdc++.so.6.0.26 /usr/lib64/libstdc++.so.6
}

function install_deps_gpu() {
#for nvidia/cuda:10.1-cudnn7-devel-centos7
pip=/usr/local/anaconda3/bin/pip3
conda=/usr/local/anaconda3/bin/conda
${pip} install -i ${PYPI_URL} pip -U
${pip} config set global.index-url ${PYPI_URL}
${pip} install tensorflow==2.3.0 networkx==2.3 attrs
${pip} install torch==1.6.0+cu101 torchvision==0.7.0+cu101 torch-scatter -f ${TORCH_URL}
${conda} install -y numpy scipy pyyaml ipython mkl mkl-include scikit-learn
${conda} install -c conda-forge -y kazoo py3nvml
${conda} clean -ya && ${pip} cache purge
}

function install_deps_cpu() {
echo "install for python $1"
pip=/opt/python/$1/bin/pip
${pip} config set global.index-url ${PYPI_URL}
${pip} install pip -U
${pip} install torch==1.6.0+cpu torchvision==0.7.0+cpu -f ${TORCH_URL}
${pip} install torch-scatter tensorflow==2.3.0 networkx==2.3 kazoo attrs
${pip} cache purge
ln -sf /opt/python/$1/bin/pip3 /usr/local/bin/pip3
ln -sf /usr/local/bin/python3.8 /usr/local/bin/python3
}

if [ ${INSTALL_CUDA} -eq 0 ];then
echo "install cpu version"
install_deps_cpu cp38-cp38
else
echo "install gpu version"
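install_gcc
# install_deps_gpu calls pip and conda from /usr/local/anaconda3, which install_py3 sets up
install_py3
install_deps_gpu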
fi

install_ssl
install_zk
setup_env
41 changes: 41 additions & 0 deletions docker/build.sh
@@ -0,0 +1,41 @@
#!/bin/bash
# Copyright 2020 JD.com, Inc. Galileo Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

set -e -u

version=${1:-1.0.0}
echo build galileo ${version}

# base
docker build -t jdgalileo/galileo:base-cpu -f base.Dockerfile \
--build-arg INSTALL_CUDA=0 \
--build-arg BASE_IMAGE=quay.io/pypa/manylinux2014_x86_64:latest .

docker build -t jdgalileo/galileo:base-gpu -f base.Dockerfile \
--build-arg INSTALL_CUDA=1 \
--build-arg BASE_IMAGE=nvidia/cuda:10.1-cudnn7-devel-centos7 .

# devel
docker build -t jdgalileo/galileo:devel-cpu -f devel.Dockerfile \
--build-arg TARGET=cpu .
docker build -t jdgalileo/galileo:devel-gpu -f devel.Dockerfile \
--build-arg TARGET=gpu .

# include galileo package
docker build -t jdgalileo/galileo:${version}-cpu -f galileo.Dockerfile \
--build-arg TARGET=cpu --build-arg VERSION=${version} .
docker build -t jdgalileo/galileo:${version}-gpu -f galileo.Dockerfile \
--build-arg TARGET=gpu --build-arg VERSION=${version} .