Tea is an open-source extension for Greenplum Database that allows it to access Apache Iceberg™ data written in Apache Parquet format on S3-compatible storage. Tea adapts PXF logic to standard PostgreSQL/Greenplum interfaces such as foreign tables and external tables.
Greenplum 5 is based on a PostgreSQL version that has no FOREIGN TABLE support, so on Greenplum 5 you can use Tea only with EXTERNAL TABLE.
Greenplum 6 has limited FOREIGN TABLE support. In particular, you cannot use the GPORCA optimizer with foreign tables.
EXTERNAL TABLE has no separate query coordinator. Such tables require special task-coordination logic, enabled in the config (we call it samovar), and a separate Redis or Valkey installation for it to work.
Samovar is also used for work-stealing between Greenplum segments.
Extract Tea from tea-{platform}-{version}.tar.gz into the Greenplum home directory.
export GPHOME=/path/to/greenplum
tar xzf tea-linux-1.70.0.tar.gz --strip-components=1 -C $GPHOME
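A quick check after extraction can catch a wrong GPHOME or a wrong --strip-components value early. The sketch below assumes only that the configuration file ends up at $GPHOME/tea/tea-config.json, which is the path the next step edits. The function name check_tea_layout is illustrative and not part of Tea.

```shell
# Hypothetical helper, not shipped with Tea: confirm that the extracted
# archive produced the config file that the next step edits.
# $1 = Greenplum home directory
check_tea_layout() {
    gphome="$1"
    if [ -z "$gphome" ]; then
        echo "GPHOME is not set" >&2
        return 2
    fi
    if [ -f "$gphome/tea/tea-config.json" ]; then
        echo "found $gphome/tea/tea-config.json"
        return 0
    fi
    echo "missing $gphome/tea/tea-config.json" >&2
    return 1
}
```

Call it as check_tea_layout "$GPHOME" right after the tar command. A non-zero exit status means the archive was unpacked somewhere else.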
Edit $GPHOME/tea/tea-config.json. At a minimum, set the access and secret keys for the object storage and the address of the Iceberg catalog.
{
"common": {
"s3": {
"access_key": "minioadmin",
"secret_key": "minioadmin",
"endpoint_override": "127.0.0.1:9000",
"scheme": "http"
},
"catalog": {
"type" : "hms",
"hms": "127.0.0.1:9083",
"rest": "127.0.0.1:19120"
}
}
}
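A malformed or incomplete config is easier to spot before Greenplum loads it. The following sketch checks that the file parses as JSON and that the minimum fields from the sample above are filled in. It is not a replacement for tea-config-schema.json: the helper name check_tea_config is hypothetical, python3 is assumed to be on PATH, and the rule that the catalog address sits under the key named by "type" is an assumption read off the sample, not a documented guarantee.

```shell
# Hypothetical helper, not shipped with Tea. Requires python3 on PATH.
# $1 = path to tea-config.json
check_tea_config() {
    python3 - "$1" <<'PY'
import json
import sys

path = sys.argv[1]
try:
    with open(path) as fh:
        cfg = json.load(fh)
except (OSError, ValueError) as exc:
    sys.exit(f"{path}: cannot load JSON: {exc}")
if not isinstance(cfg, dict):
    sys.exit(f"{path}: top-level value must be an object")

common = cfg.get("common", {})
s3 = common.get("s3", {})
catalog = common.get("catalog", {})

# Fields the text names as the minimum: S3 access and secret keys.
problems = [f"common.s3.{key} is empty"
            for key in ("access_key", "secret_key") if not s3.get(key)]

# Assumption drawn from the sample config: the catalog address is stored
# under the key named by "type" ("hms" or "rest").
kind = catalog.get("type")
if not kind:
    problems.append("common.catalog.type is empty")
elif not catalog.get(kind):
    problems.append(f"common.catalog.{kind} is empty")

if problems:
    sys.exit("; ".join(problems))
print("ok")
PY
}
```

Running check_tea_config "$GPHOME/tea/tea-config.json" prints ok when the checks pass. Otherwise it lists the problems on stderr and returns a non-zero status.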
You can define additional profile sections that override the common settings.
Further configuration fields are listed in tea-config.json and tea-config-schema.json.
To access data from Apache Iceberg, register the tea extension and create an EXTERNAL TABLE or FOREIGN TABLE for every Iceberg table you want to read.
CREATE EXTENSION tea;
CREATE READABLE EXTERNAL TABLE table_name (...)
LOCATION ('tea://iceberg_namespace.iceberg_table')
FORMAT 'custom' (formatter = tea_import);
This creates an EXTERNAL TABLE linked to the Iceberg table iceberg_namespace.iceberg_table declared in the Iceberg catalog.
The created table is available for reading.
CREATE EXTENSION tea;
CREATE SERVER tea_server FOREIGN DATA WRAPPER tea_fdw;
CREATE FOREIGN TABLE table_name (...)
SERVER tea_server
OPTIONS(location 'tea://iceberg_namespace.iceberg_table');
This creates a FOREIGN TABLE linked to the Iceberg table iceberg_namespace.iceberg_table declared in the Iceberg catalog.
The created table is available for reading.
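Since every Iceberg table needs its own statement, generating the DDL from a script can save typing when there are many tables. The sketch below prints the two statement forms shown above. The function names are hypothetical, the column list is whatever you would write in place of (...), and the arguments are pasted into the SQL without any escaping.

```shell
# Hypothetical helpers, not part of Tea. Arguments are inserted into the
# SQL verbatim, so pass only trusted identifiers and column lists.

# $1 = Greenplum table, $2 = Iceberg namespace.table, $3 = column list
tea_external_table_ddl() {
    printf 'CREATE READABLE EXTERNAL TABLE %s (%s)\n' "$1" "$3"
    printf "LOCATION ('tea://%s')\n" "$2"
    printf "FORMAT 'custom' (formatter = tea_import);\n"
}

# Same arguments, plus $4 = foreign server name (defaults to tea_server)
tea_foreign_table_ddl() {
    printf 'CREATE FOREIGN TABLE %s (%s)\n' "$1" "$3"
    printf 'SERVER %s\n' "${4:-tea_server}"
    printf "OPTIONS(location 'tea://%s');\n" "$2"
}
```

For example, tea_external_table_ddl sales_ext analytics.sales 'id bigint, amount numeric' | psql -d mydb would create one table. The table, namespace and database names here are placeholders.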
Extract a new version
export GPHOME=/path/to/greenplum
tar xzf tea-linux-1.71.0.tar.gz --strip-components=1 -C $GPHOME
Update the extension the PostgreSQL way
ALTER EXTENSION tea UPDATE TO '1.71.0';
SELECT installed_version FROM pg_available_extensions WHERE name='tea';
Downgrade the extension if needed
ALTER EXTENSION tea UPDATE TO '1.70.0';
Install GCC 13
sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test
sudo apt-get install gcc-13 g++-13
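The remaining steps also call git, wget, make and ninja, none of which are installed by the commands above. A small check up front avoids a failure halfway through a long build. This is a minimal sketch; the helper name missing_tools is illustrative.

```shell
# Hypothetical helper: print every named command that is not on PATH and
# return non-zero if at least one is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For this build, missing_tools gcc-13 g++-13 git wget make ninja lists whatever still has to be installed.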
Create a directory for source and binaries
COMPILE_DIR=$HOME/compile
mkdir -p $COMPILE_DIR
Get CMake 3 (>=3.25)
cd $COMPILE_DIR
wget https://github.com/Kitware/CMake/releases/download/v3.31.11/cmake-3.31.11-linux-x86_64.tar.gz
tar xf cmake-3.31.11-linux-x86_64.tar.gz
PATH=$COMPILE_DIR/cmake-3.31.11-linux-x86_64/bin:$PATH
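If another CMake is already installed, it is worth confirming that the one found first on PATH meets the 3.25 minimum. The comparison below is a sketch that relies on the version sort (sort -V) from GNU coreutils. The helper name version_at_least is illustrative.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage, once cmake is on PATH:
#   found=$(cmake --version | head -n 1 | awk '{print $3}')
#   version_at_least "$found" 3.25 || echo "CMake $found is older than 3.25" >&2
```

The function sorts the two versions and succeeds when the required one sorts first, so equal versions also pass.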
Get TEA source
git clone https://github.com/lithium-tech/tea -b OPENGPDB_GP6
Build and install Arrow
git clone https://github.com/apache/arrow.git -b maint-15.0.2
cd arrow
git apply $COMPILE_DIR/tea/vendor/arrow/fix_c-ares_url.patch
./cpp/thirdparty/download_dependencies.sh $COMPILE_DIR/arrow-thirdparty
mkdir cpp/build
cd $_
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=$COMPILE_DIR/bin \
-DCMAKE_C_COMPILER=gcc-13 -DCMAKE_CXX_COMPILER=g++-13 \
-DARROW_BUILD_STATIC=ON -DARROW_BUILD_SHARED=OFF \
-DARROW_DEPENDENCY_SOURCE=BUNDLED -DARROW_NO_DEPRECATED_API=ON \
-DARROW_LLVM_USE_SHARED=OFF -DARROW_FILESYSTEM=ON -DARROW_PARQUET=ON \
-DARROW_S3=ON -DARROW_WITH_SNAPPY=ON -DARROW_WITH_LZ4=ON \
-DARROW_WITH_ZLIB=ON -DARROW_WITH_ZSTD=ON -DARROW_IPC=ON -DARROW_CSV=ON \
-DARROW_WITH_RAPIDJSON=ON -DARROW_GANDIVA=ON -DARROW_COMPUTE=ON ..
make -j`nproc`
make install
Build and install gRPC
cd $COMPILE_DIR
git clone https://github.com/grpc/grpc.git -b v1.62.3
cd grpc
git submodule update --init --single-branch --depth 1
mkdir build
cd $_
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=$COMPILE_DIR/bin \
-DCMAKE_C_COMPILER=gcc-13 -DCMAKE_CXX_COMPILER=g++-13 \
-DgRPC_BUILD_SHARED_LIBS=OFF -DgRPC_BUILD_STATIC_LIBS=ON \
-DgRPC_BUILD_TESTS=OFF -DgRPC_BUILD_EXAMPLES=OFF \
-DgRPC_BUILD_CSHARP_EXT=OFF -DgRPC_BUILD_GRPC_CSHARP_PLUGIN=OFF \
-DgRPC_BUILD_GRPC_NODE_PLUGIN=OFF -DgRPC_BUILD_GRPC_OBJECTIVE_C_PLUGIN=OFF \
-DgRPC_BUILD_GRPC_PHP_PLUGIN=OFF -DgRPC_BUILD_GRPC_PYTHON_PLUGIN=OFF \
-DgRPC_BUILD_GRPC_RUBY_PLUGIN=OFF -DgRPC_SSL_PROVIDER:STRING=package ..
make -j`nproc`
make install
Build TEA. Replace $COMPILE_DIR/gpdb_bin with your OpenGPDB root.
cd $COMPILE_DIR/tea
mkdir -p build/arrow-thirdparty
cd build
cp $COMPILE_DIR/arrow-thirdparty/* arrow-thirdparty/
cmake .. -GNinja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_PREFIX_PATH=$COMPILE_DIR/bin \
-DCMAKE_C_COMPILER=gcc-13 -DCMAKE_CXX_COMPILER=g++-13 \
-DGreenplum_ROOT=$COMPILE_DIR/gpdb_bin/
ninja
ninja install