
macOS build support (Apple clang, PG 16.9) #1774

Open
avamingli wants to merge 11 commits into apache:main from avamingli:mac-port

@avamingli
Contributor

Background

Apache Cloudberry today builds cleanly on Linux only. macOS developers
have to spin up a Linux VM or container to compile and run the
project locally, which slows the inner loop on day-to-day work (read
code, change a line, see it run).

This PR brings the macOS host (Intel Mac tested; Apple Silicon
should follow the same pattern) up to first-class buildable status:

  • ./configure succeeds with the full feature set
    (PAX, ic-proxy, ORCA, orafce, mapreduce, gpcloud, paxformat,
    tap-tests, plperl, plpython3, GSSAPI, PAM, LDAP, LZ4, ZSTD, …).
  • make && make install complete cleanly.
  • gpAux/gpdemo produces a 3-pair demo cluster with coordinator +
    standby + 3 primaries + 3 mirrors, all in sync.

Tested on macOS 13.x x86_64 with Xcode CommandLineTools (Apple clang
14.0.3) and Homebrew at /usr/local.

Strict guarantee: zero impact on existing Linux/gcc builds

Every change in this PR is either gated off on Linux/gcc or compiles
identically there, so the existing Linux build path is byte-for-byte
unchanged. Specifically, each modified file falls into one of these
categories:

| Mechanism | Files |
| --- | --- |
| PORTNAME=darwin Makefile gate | src/backend/Makefile, src/backend/gporca/gporca.mk, contrib/pax_storage/Makefile |
| if(APPLE) / NOT APPLE / CMAKE_SYSTEM_NAME cmake gate | contrib/pax_storage/CMakeLists.txt, contrib/pax_storage/src/cpp/CMakeLists.txt, contrib/pax_storage/src/cpp/cmake/{pax,pax_format}.cmake |
| case $host_os in linux*) ;; autoconf gate | configure.ac, configure (for liburing + Python probe) |
| Template file loaded only on darwin | src/template/darwin (the DLSUFFIX pin) |
| #ifdef __APPLE__ / #ifdef __linux__ C preprocessor gate | contrib/pax_storage/src/cpp/comm/fast_io.{h,cc}, contrib/pax_storage/src/cpp/storage/local_file_system.cc, contrib/pax_storage/src/cpp/storage/file_system.h |
| #if defined(__clang__) C++ gate (preserves gcc codegen) | contrib/pax_storage/src/cpp/storage/columns/pax_delta_encoding.cc (VLA vs std::vector) |
| Identical compilation on Linux (zero functional change) | src/include/utils/numeric.h (drop bogus const on non-pointer return type), contrib/pax_storage/src/cpp/storage/proto/proto_wrappers.h (defensive #undef/redef of PG macros around protobuf include), contrib/pax_storage/src/cpp/storage/proto/protobuf_stream.{h,cc} (google::protobuf::int64 → int64_t, equivalent typedef on older protobuf), contrib/pax_storage/src/cpp/storage/columns/pax_encoding_utils.{h,cc} (int64_t * → int64 *, identical LP64 type on Linux), src/test/regress/GNUmakefile (.so → $(DLSUFFIX), which expands to .so on Linux) |

For every gated change the else / non-gated branch retains the
exact pre-PR code (verbatim). For every non-gated change the
generated machine code on Linux gcc is identical.
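
As an illustration of the gate shape described above, here is a minimal sketch of a `#if defined(__clang__)` split in the style of the `pax_delta_encoding.cc` change. The function `WidestBlock` and its computation are hypothetical; only the `bit_widths` name and the "std::vector under clang, stack VLA otherwise" split come from this PR. The non-clang branch assumes a compiler with the GNU variable-length-array extension (gcc).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: stores the bit width of each block's maximum value
// in bit_widths, then returns the widest one.
inline uint8_t WidestBlock(const uint64_t *max_values, std::size_t num_blocks) {
  if (num_blocks == 0) return 0;  // avoid a zero-length array in either branch

#if defined(__clang__)
  // clang branch: heap-backed, zero-initialized storage.
  std::vector<uint8_t> storage(num_blocks, 0);
  uint8_t *bit_widths = storage.data();
#else
  // Non-clang branch: stack VLA (GNU extension), zeroed explicitly.
  // This stands in for the pre-PR code path that gcc keeps.
  uint8_t bit_widths[num_blocks];
  std::memset(bit_widths, 0, num_blocks);
#endif

  uint8_t widest = 0;
  for (std::size_t i = 0; i < num_blocks; ++i) {
    // Count the bits needed to represent max_values[i].
    for (uint64_t v = max_values[i]; v != 0; v >>= 1) ++bit_widths[i];
    if (bit_widths[i] > widest) widest = bit_widths[i];
  }
  return widest;
}
```

Note that clang also defines `__GNUC__`, which is presumably why the gate tests `__clang__` rather than the absence of gcc.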

make installcheck on Linux should be entirely unaffected.

What the changes touch (high level)

Roughly four buckets, 9 atomic commits, 23 files, +280 / −67.

  1. Two universally-beneficial fixes for clang strictness that
    happen to unblock macOS:

    • Drop the gcc-only -Werror -Wextra -Wpedantic triple from
      gporca.mk on darwin only. clang reports far more
      diagnostics than gcc under these flags and breaks the build
      for no real safety benefit (per-feature -Werror= flags in
      the base CXXFLAGS still catch real bugs). Linux retains the
      original triple.
    • Fix a typo in init_var_from_str's prototype
      (extern const bool → extern bool) that gcc silently
      tolerated but clang correctly flags as a conflicting type
      against the definition. Identical compilation on gcc;
      introduced by upstream PR #392, "Export numeric interface
      to public".
  2. macOS header and link-time differences vs Linux:

    • macOS <sys/param.h> doesn't define HZ; provide a 100
      fallback so contrib/interconnect/udp/ic_udpifc.c compiles.
      #ifndef HZ makes this a no-op on Linux glibc.
    • macOS ld rejects unresolved symbols in shared libraries
      by default. With --enable-shared-postgres-backend, the
      libpostgres.so recipe deliberately omits main/main.o
      but other objects reference its symbols (progname, …).
      Add -Wl,-undefined,dynamic_lookup only when
      PORTNAME=darwin so those resolve at load time.
  3. PG 16 ↔ Cloudberry DLSUFFIX cohesion on darwin only.
    Upstream PG 16 commit b55f62abb2c "Unify DLSUFFIX on Darwin"
    changed macOS module suffix from .so to .dylib.
    Cloudberry's catalog SQL, cdb_init.d scripts, expected/*.out
    files still hard-code $libdir/foo.so. When PG sees an
    explicit .so it does NOT re-append DLSUFFIX, so catalog
    bootstrap breaks (FATAL: could not access file "foo").
    This PR pins DLSUFFIX=.so on darwin via src/template/darwin
    (file loaded only on darwin), plus two small follow-ons that
    no-op on Linux:

    • configure's Python-shared-library probe gets a .dylib
      fallback only when $PORTNAME = darwin (macOS libpython
      ships as .dylib regardless of module DLSUFFIX).
    • src/test/regress/GNUmakefile uses $(DLSUFFIX) instead
      of hardcoded .so for install/uninstall lines — on Linux
      $(DLSUFFIX) expands to .so, so this is a textual
      no-change after make-time substitution.
  4. PAX portability for non-Linux. Four focused commits:

    • liburing optional: configure keeps the hard error on
      Linux (case $host_os in linux*) AC_MSG_ERROR ;;) but
      allows missing on non-Linux. PAX has a pread-based
      SyncFastIO fallback selected at runtime by
      IOUringFastIO::available(). C++ code wraps the
      IOUringFastIO class with #ifdef __linux__. Linux gets
      identical behavior.
    • C++ portability: off64_t typedef
      (#ifdef __APPLE__-only), int64_t → int64 signature
      alignment (identical type on Linux LP64: the same long int),
      defensive #undef Min/Max/IsPowerOf2 around protobuf
      includes (no-op on Linux: if old protobuf, no abseil
      shadowing; if new protobuf, the same fix is needed there
      too), VLA → std::vector only under clang via
      #if defined(__clang__) (gcc keeps zero-overhead stack
      VLA), google::protobuf::int64 → int64_t (typedef
      identical for older protobuf, required for v22+).
    • cmake portability: gcc-only flags moved behind
      if(APPLE)/else(); BUILD_GTEST=OFF only on darwin so
      Linux make pax-unit-test continues to work; -luuid only
      on Linux (libSystem provides it on macOS); on macOS only,
      build pax as a MODULE (Mach-O bundle) and link with
      -bundle_loader <postgres> -undefined dynamic_lookup so
      backend globals have a single shared instance between
      postgres and pax.so; abseil deps via pkg-config on macOS
      (Homebrew protobuf v22+ splits them); @loader_path
      instead of $ORIGIN on macOS. Linux uses the else
      branches in all cases — original behavior verbatim.
    • paxformat: the standalone reader. On macOS add
      -Wl,-undefined,dynamic_lookup to defer PG-backend symbol
      references; Linux build path unchanged.
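
The "defensive #undef Min/Max" item in bucket 4 can be sketched as follows. This is a minimal illustration, not the contents of `proto_wrappers.h`: the macro bodies are stand-ins for the ones PG's `c.h` defines, and the `thirdparty` namespace, `SmallerOf`, and `ClampToByte` are hypothetical names standing in for the abseil/protobuf headers and their callers.

```cpp
// Stand-ins for the function-like macros that PostgreSQL's c.h defines.
#define Max(x, y) ((x) > (y) ? (x) : (y))
#define Min(x, y) ((x) < (y) ? (x) : (y))

// Remove the macros before pulling in code that declares the same names.
// Without this, the declarations below would be macro-expanded and fail
// to compile.
#undef Max
#undef Min

namespace thirdparty {  // hypothetical stand-in for an abseil/protobuf header
template <typename T>
constexpr T Min(T a, T b) { return b < a ? b : a; }
template <typename T>
constexpr T Max(T a, T b) { return a < b ? b : a; }
}  // namespace thirdparty

// Code that calls the third-party functions must also live inside the
// shielded region, while the macros are still undefined.
inline int SmallerOf(int a, int b) { return thirdparty::Min(a, b); }

// Restore the PG-style definitions for the rest of the translation unit.
#define Max(x, y) ((x) > (y) ? (x) : (y))
#define Min(x, y) ((x) < (y) ? (x) : (y))

// From here on, Min/Max are the macros again.
inline int ClampToByte(int v) { return Max(0, Min(v, 255)); }
```

Restoring by re-`#define` requires repeating the macro bodies; `#pragma push_macro` / `pop_macro` would be an alternative on compilers that support it, though the PR text describes an explicit undef-and-restore.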

Commits (kept atomic — order matters for builds)

  1. Drop -Werror -Wextra -Wpedantic from ORCA C++ build
  2. numeric: drop bogus 'const' qualifier on init_var_from_str
  3. macOS: provide HZ fallback for UDP interconnect
  4. macOS: defer libpostgres.so's undefined symbols to load time
  5. macOS: pin DLSUFFIX=.so for PG 16 compatibility
  6. macOS: make PAX's liburing dependency optional
  7. macOS: PAX C++ portability
  8. macOS: PAX cmake portability
  9. macOS: build the standalone libpaxformat reader

How I verified

Fresh checkout of this branch, on macOS 13.x x86_64 / Apple clang
14.0.3, with these Homebrew packages installed:

bison flex pkg-config libxml2 zstd openssl@3 readline apr apr-util
libevent xerces-c icu4c gettext libyaml lz4 perl protobuf libuv krb5

PREFIX=$HOME/install/cbdb-mac
./configure --prefix=$PREFIX --enable-debug --enable-cassert \
  --enable-pax --enable-ic-proxy --enable-orafce --enable-tap-tests \
  --enable-mapreduce --enable-gpcloud \
  --with-perl --with-python --with-libxml --with-openssl \
  --with-lz4 --with-pam --with-ldap --with-gssapi \
  --with-apr-config=$(brew --prefix apr)/bin/apr-1-config \
  --with-includes=<homebrew prefixes joined by ':'> \
  --with-libs=<homebrew prefixes joined by ':'>

make -j$(sysctl -n hw.ncpu) \
  CUSTOM_COPT="-Wno-error=uninitialized -Wno-error=gnu-variable-sized-type-not-at-end"
make install
ln -sf $PREFIX/cloudberry-env.sh $PREFIX/greenplum_path.sh

cd gpAux/gpdemo
NUM_PRIMARY_MIRROR_PAIRS=3 make create-demo-cluster

After which:

  • gp_segment_configuration shows 1 coordinator + 1 standby +
    3 primaries + 3 mirrors, all status='u' / mode='s'.
  • CREATE TABLE t USING pax → insert 1000 rows → SELECT count(*) returns 1000 (ORCA plan used).
  • orafce, plperl, plpython3u extensions install + work.
  • gpstate -s reports Mirror status = Synchronized for all
    three mirrors.

The Linux build path was not re-verified for this PR (no Linux machine
handy), but it is structurally unchanged; see the "Strict guarantee"
section above. A CI run to confirm this would be welcome.

Caveats / out of scope

  • PAX's io_uring fast path stays Linux-only. macOS has no
    equivalent. On Linux: unchanged. On macOS: PAX falls back to
    the existing pread-based SyncFastIO.
  • uuid-ossp extension is not enabled in the configure
    command above because Homebrew e2fsprogs / ossp-uuid are
    heavy to pull in. Orthogonal to this PR.
  • GSSAPI on macOS requires Homebrew MIT krb5 in PATH /
    PKG_CONFIG_PATH; macOS's system Heimdal lacks
    gss_store_cred_into (used by PG 16). Environment setup
    detail, no code change in this PR.


@avamingli changed the title from "macOS build support (Apple clang + Homebrew, PG 16.9" to "macOS build support (Apple clang, PG 16.9)" on May 29, 2026
@avamingli requested a review from my-ship-it on May 29, 2026 04:03
avamingli added 11 commits May 29, 2026 18:31

liburing is the Linux-only userspace API for the io_uring kernel
interface (Linux 5.1+). It is not available on macOS or *BSD.

PAX already has a sync fallback path (SyncFastIO using pread(2)) and
LocalFile::ReadBatch picks between the two via IOUringFastIO::available().
This commit lets the build proceed without liburing:

  * configure: AC_CHECK_LIB(uring) no longer aborts when missing; PAX
    falls back to SyncFastIO at runtime.
  * fast_io.h / fast_io.cc: wrap the IOUringFastIO class declaration
    and methods with #ifdef __linux__ so they only exist where the
    header is available.
  * local_file_system.cc: gate the IOUringFastIO::available() branch
    with #ifdef __linux__; non-Linux unconditionally uses SyncFastIO.

ic_udpifc.c uses the HZ macro (kernel timer frequency) for RTO
calculations. On Linux glibc, HZ is exposed via <sys/param.h>
(typically 100). macOS's <sys/param.h> does not define HZ.

Define HZ=100 as a fallback when the platform headers don't provide
it. The exact value only affects retransmit pacing constants
(UDP_RTO_MIN, TIME_TICK); 100 matches the Linux default.
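
The fallback described in this commit message is a standard `#ifndef` guard. A minimal sketch, assuming a POSIX system with `<sys/param.h>`: the helpers `TickMicros` and `MicrosToTicks` are hypothetical and are not the actual `UDP_RTO_MIN` / `TIME_TICK` definitions from `ic_udpifc.c`.

```cpp
#include <sys/param.h>  // defines HZ on Linux glibc; does not on macOS

// Only takes effect where the platform headers left HZ undefined, so it is
// a no-op on Linux.
#ifndef HZ
#define HZ 100
#endif

// Hypothetical helper: length of one timer tick in microseconds.
inline long TickMicros() { return 1000000L / HZ; }

// Hypothetical helper: number of whole ticks needed to cover a duration,
// rounding up.
inline long MicrosToTicks(long micros) {
  return (micros + TickMicros() - 1) / TickMicros();
}
```

With HZ at 100, one tick is 10000 microseconds, so timing constants derived this way come out the same on macOS as on a Linux system whose headers define HZ as 100.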

src/backend/gporca/gporca.mk forced -Werror -Wextra -Wpedantic on top
of the project CXXFLAGS via 'override CXXFLAGS := ...'. clang reports
many more warnings than gcc under -Wextra/-Wpedantic, which then
become errors and break the macOS build (unused-but-set-variable,
inconsistent-missing-override, ...).

Keep -fno-omit-frame-pointer (used by ORCA's backtracing); drop the
strict warning flags. The base project CXXFLAGS still includes -Wall
and per-feature -Werror= flags.

With --enable-shared-postgres-backend (default) the libpostgres.so
recipe filters out main/main.o, but other backend objects still
reference symbols defined in main.o (progname, etc.). On Linux the
default linker behaviour permits undefined references in a shared
library; on macOS, ld -dynamiclib rejects them with 'Undefined
symbols ... _progname'.

Pass -Wl,-undefined,dynamic_lookup only when PORTNAME=darwin so those
symbols resolve at load time against the postgres executable that
loads libpostgres.so. The Linux behaviour is unchanged.

Several gcc / GNU-ld specific bits in the PAX cmake setup break under
Apple clang and Apple ld. Make them conditional on Linux.

  * CMakeLists.txt: gate -no-pie / -Wl,--allow-multiple-definition /
    -fno-access-control / -Wno-pmf-conversions (gcc-only) behind
    if(NOT APPLE). Replace gcc-only warning disables (-Wno-clobbered,
    -Wno-sized-deallocation, -Wno-parameter-name) with
    -Wno-unknown-warning-option on APPLE, plus a set of
    -Wno-error= demotions for clang-stricter diagnostics
    (inconsistent-missing-override, overloaded-virtual,
    sometimes-uninitialized, unused-private-field, format,
    mismatched-tags, pessimizing-move, unused-but-set-variable,
    deprecated-copy, unused-result).
  * Makefile: pass -DBUILD_GTEST=OFF to the inner cmake (googletest's
    own flags clash with clang). Drop SHLIB_LINK += -luuid on darwin
    (uuid_* is in libSystem). Move the ifneq to after the include of
    Makefile.global so PORTNAME is actually defined. Cope with cmake
    naming the artefact libpax.so on every platform now (see below).
  * src/cpp/CMakeLists.txt: skip the standalone libpaxformat target on
    macOS. It links to backend symbols (write_stderr, ...) directly
    and isn't needed to load PAX inside postgres.
  * pax.cmake: on APPLE, build pax as a MODULE (Mach-O bundle, what
    PG extensions are) instead of SHARED, and link with
    -Wl,-undefined,dynamic_lookup -Wl,-bundle_loader,<postgres>. This
    is the standard PG extension pattern; it guarantees that backend
    globals (e.g. process_shared_preload_libraries_in_progress) have
    one shared instance between postgres and pax.so. Also gate -luring
    on Linux; pull abseil deps via pkg-config on macOS (Homebrew's
    protobuf v22+ split them into separate libs); replace the
    Linux-only $ORIGIN INSTALL_RPATH and -Wl,--enable-new-dtags with
    @loader_path on macOS.
  * pax_format.cmake: same Linux-only treatment for -luuid / -luring
    and the abseil pkg-config.

Compile-time fixes to let PAX build with Apple clang against
Homebrew's modern protobuf (v22+).

  * file_system.h: typedef off64_t = off_t on macOS. glibc exposes
    off64_t for 32-bit programs opting into 64-bit file offsets;
    macOS's off_t is already 64-bit so there is no separate symbol.
  * pax_encoding_utils.{h,cc}: change BuildHistogram/ZigZagBuffers
    signatures from int64_t* to PG's int64* (long). On macOS x86_64
    int64_t is 'long long', distinct from 'long' for overload
    resolution even though both are 64-bit. Using PG's int64
    (consistently 'long' on every supported port) keeps callers from
    PG-typed buffers (e.g. DataBuffer<int64>::StartT) working on both
    platforms.
  * pax_delta_encoding.cc: replace 'uint8_t bit_widths[var] = {0};'
    (a gcc VLA-with-initializer extension that clang rejects) with
    std::vector<uint8_t>.
  * proto_wrappers.h: PG's c.h defines Min/Max macros and
    xlog_internal.h defines IsPowerOf2; abseil (pulled in by modern
    protobuf headers) declares identifiers with the same names.
    #undef them around the protobuf include, then restore PG's
    definitions afterwards.
  * protobuf_stream.{h,cc}: protobuf v22 removed the
    google::protobuf::int64 typedef. Use int64_t directly (which is
    the type of the base ZeroCopy{Output,Input}Stream::ByteCount()
    override anyway).
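
The `int64_t*` versus `int64*` point in the list above comes down to `long` and `long long` being distinct types even when both are 64 bits wide. A minimal sketch: `pg_int64` is a hypothetical stand-in for PG's `int64` (which this commit message describes as `long`), and `SumValues` is an illustrative function, not one of the PAX signatures.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for PostgreSQL's int64 as described above.
using pg_int64 = long;

// A function declared against the PG-style type, as the adjusted
// signatures are.
inline pg_int64 SumValues(const pg_int64 *values, int count) {
  pg_int64 total = 0;
  for (int i = 0; i < count; ++i) total += values[i];
  return total;
}

// Which builtin type the platform's int64_t aliases. On Linux x86_64 it is
// 'long'; per the commit message, on macOS x86_64 it is 'long long'.
constexpr bool kInt64IsLong = std::is_same<std::int64_t, long>::value;
constexpr bool kInt64IsLongLong = std::is_same<std::int64_t, long long>::value;

// Where kInt64IsLongLong is true, the following would not compile, because
// 'long long *' does not convert to 'const long *':
//   std::int64_t raw[] = {1, 2, 3};
//   SumValues(raw, 3);
```

Declaring both the buffers and the function parameters with the same PG typedef sidesteps the mismatch regardless of which builtin type `int64_t` aliases.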

paxformat is the standalone PAX file reader meant to be linked by
external tools. The previous commit 'macOS: PAX cmake portability'
skipped it on APPLE because it references PG backend functions
(write_stderr, xlog_check_consistency_hook, ...) that have no
libpostgres.so to satisfy them on macOS.

Build it after all by deferring those undefined symbols to load time
with -Wl,-undefined,dynamic_lookup, just like Linux's default ld
behaviour for shared libraries. The smoke-test executable
paxformat_test uses the same flag and drops the explicit 'postgres'
link library (there is no libpostgres.so to link against on macOS).

paxformat is a regular dylib (SHARED) here — not a bundle — because
the test executable links against it at link time.

The prototype in src/include/utils/numeric.h declared the function as
returning 'const bool', which conflicts with the definition in
numeric.c that returns plain 'bool'. gcc tolerates this (const on a
non-pointer return is silently meaningless), but clang flags it as
'conflicting types' and the build fails on macOS.

Introduced by 'apache#392 Export numeric interface to public'.

Match the .c definition: plain 'bool'.

PG 16 upstream commit b55f62a ('Unify DLSUFFIX on Darwin') changed
DLSUFFIX on macOS from .so to .dylib so the suffix would match both
linkable shared libraries and dlopen'd modules. Cloudberry, however,
has many places that still hard-code '$libdir/foo.so' — the catalog
SQL bootstrap scripts, cdb_init.d, and a number of expected/*.out
files. When PG sees an explicit '.so' suffix in a library reference
it does NOT re-append DLSUFFIX, so a .so / .dylib divergence breaks
the catalog bootstrap (FATAL: could not access file 'foo').
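
To see why a hard-coded '.so' cannot be rescued by a different DLSUFFIX, consider this deliberately simplified model. It is an assumption-laden sketch, not PostgreSQL's loader code: it assumes a library name is tried as written and then with DLSUFFIX appended, and it ignores `$libdir` expansion and search paths. `CandidateFiles` and `Resolves` are hypothetical names.

```cpp
#include <set>
#include <string>
#include <vector>

// Simplified model: the names a loader would try for a library reference.
// An existing suffix is never stripped or replaced, only appended to.
inline std::vector<std::string> CandidateFiles(const std::string &name,
                                               const std::string &dlsuffix) {
  return {name, name + dlsuffix};
}

// True if any candidate name matches a file in the installed set.
inline bool Resolves(const std::string &name, const std::string &dlsuffix,
                     const std::set<std::string> &installed) {
  for (const std::string &candidate : CandidateFiles(name, dlsuffix)) {
    if (installed.count(candidate) != 0) return true;
  }
  return false;
}
```

Under this model, a reference to `foo.so` with DLSUFFIX `.dylib` yields the candidates `foo.so` and `foo.so.dylib`, neither of which matches an installed `foo.dylib`. A suffix-less reference `foo` would resolve, as would `foo.so` once DLSUFFIX is pinned back to `.so` and the module is built under that name.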

Restore the pre-PG16 behaviour of DLSUFFIX=.so on darwin (via
src/template/darwin) so all those hard-coded references resolve.

Two follow-on adjustments are needed:

  * configure: the Python-shared-library probe builds a candidate path
    as '$python_libdir/lib$ldlibrary$DLSUFFIX'. macOS Python ships
    its shared lib as .dylib regardless of what DLSUFFIX is set to for
    modules; without a .dylib fallback the probe fails with
    'could not find shared library for Python'. Try .dylib alongside
    DLSUFFIX on darwin.

  * src/test/regress/GNUmakefile: a handful of install/uninstall
    lines hard-coded '.so' for the test-helper modules
    (regress, test_hook, query_info_hook_test). Replace with
    $(DLSUFFIX) so they keep working regardless of the value.

After this commit, plain 'make' (without any CUSTOM_COPT= on the
command line) builds cleanly on macOS with Apple clang. Previously
users had to remember a long override.

The warning categories that needed demoting (darwin only):

  -Wuninitialized           — clang flags spots upstream gcc accepts.
  -Wgnu-variable-sized-type-not-at-end
                            — clang-only; fires on PG catalog headers
                              like pg_task.h with inline struct-plus-
                              trailing-text declarations.
  -Wunused-function         — clang flags static functions never
                              referenced in the TU (a few exist in
                              currently-disabled code paths, e.g.
                              ic_udpifc.c).
  -Wdeprecated-non-prototype
                            — clang-only; flags K&R-style 'foo()'
                              forward declarations. Where the
                              mismatch is a real bug we fix it
                              inline; this demotion covers the rest.

The block is gated to PORTNAME=darwin so the Linux gcc build path
is unchanged: -Werror remains in effect for all categories there.

The forward declaration at line 838 was

    static void initSndBufferPool();

which under C's old K&R rules means 'function with unspecified
arguments' — but the actual definition takes a SendBufferPool *:

    static void
    initSndBufferPool(SendBufferPool *p)

and the single caller at line 3677 passes &snd_buffer_pool. gcc
silently accepts the mismatch; Apple clang correctly rejects it
under -Wdeprecated-non-prototype. C2x will reject it on every
compiler. Match the forward declaration to the definition.