[Bug] gpbackman: --history-db should fall back to $COORDINATOR_DATA_DIRECTORY, and OpenHistoryDB should not silently create empty files #96

@talmacschen-arch

Apache Cloudberry and cloudberry-backup version

cloudberry 2.0.x & 2.1.x
cloudberry-backup main

What happened

Summary

gpbackman makes two reasonable defaults harder than they need to be:

  1. The --history-db flag has no fallback to the standard
    $COORDINATOR_DATA_DIRECTORY / $MASTER_DATA_DIRECTORY env vars
    that the Cloudberry/Greenplum environment scripts already set.
  2. When the resolved path does not exist, SQLite silently creates an
    empty database, and the next query fails with the misleading
    no such table: backups. An empty gpbackup_history.db is also
    left behind in cwd, polluting unrelated working directories.

Reproduce on main

$ which gpbackman
/usr/local/cloudberry/bin/gpbackman

$ cd /tmp                       # any directory without a history DB
$ gpbackman backup-info
[ERROR]:-Unable to read data from history db. Error: no such table: backups

$ ls /tmp/gpbackup_history.db
-rw-r--r-- 1 gpadmin gpadmin 0 Apr 29 12:40 /tmp/gpbackup_history.db
                              ^^^ silently created, polluting cwd
                                                                                                                                                                           
Expected behaviour                                                                                                                                                         
                                                                                                                                                                           
- When --history-db is omitted, gpbackman should look up
  $COORDINATOR_DATA_DIRECTORY/gpbackup_history.db (and then
  $MASTER_DATA_DIRECTORY/gpbackup_history.db for older installs)
  before falling back to cwd. This matches how operators already
  source the cluster environment.
- When the resolved file does not exist, gpbackman should fail loudly
  with a clear, actionable error and must not create an empty SQLite
  database on disk.
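
The lookup order described above could be sketched roughly as follows. This is only an illustration, not gpbackman's actual code: the function name resolveHistoryDBPath, the historyDBName constant, and the injected getenv parameter (which would be os.Getenv in practice) are all hypothetical. The sketch also assumes the first non-empty env var wins, without checking whether the file exists there.

```go
package main

import "path/filepath"

// historyDBName is the default history database file name used in this issue.
const historyDBName = "gpbackup_history.db"

// resolveHistoryDBPath is a hypothetical helper, not existing gpbackman code.
// It returns the explicit flag value when given; otherwise it tries the
// coordinator and master data directory env vars in that order, and finally
// falls back to the bare file name (resolved against cwd by the caller).
// getenv is injected so the lookup can be tested; pass os.Getenv in real use.
func resolveHistoryDBPath(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	for _, name := range []string{"COORDINATOR_DATA_DIRECTORY", "MASTER_DATA_DIRECTORY"} {
		if dir := getenv(name); dir != "" {
			return filepath.Join(dir, historyDBName)
		}
	}
	// Last resort: preserves the existing behaviour of using cwd.
	return historyDBName
}
```

With COORDINATOR_DATA_DIRECTORY set and no flag, a call such as resolveHistoryDBPath("", os.Getenv) would return the path inside the coordinator directory rather than a file in cwd.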
                                                          
Why this matters                                                                                                                                                           
                                                          
- Operators routinely run gpbackman from ~ after sourcing
  greenplum_path.sh or its equivalent. Today they must always type the
  full coordinator path, which is fragile across cluster layouts.
- The "silent create + cryptic later error" UX has bitten me during
  exploration; the failure mode is non-local and easy to misdiagnose
  as a corrupt history database.
                                                                                                                                                                           


What you think should happen instead

Proposed approach                                                                                                                                                          
                                                                                                                                                                           
- Resolve the history DB path in this order: --history-db (if set) →
  $COORDINATOR_DATA_DIRECTORY → $MASTER_DATA_DIRECTORY → bare filename
  in cwd (last resort, preserves existing behaviour).
- In OpenHistoryDB, pre-check the path with os.Stat and open SQLite via
  the file:<path>?mode=rw URI (read+write, but never create).
- Return a friendly error message that names the missing path and
  points at the flag and env vars.
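
A minimal sketch of the pre-check and the no-create URI, using only the standard library. The helper names sqliteRWURI and checkedHistoryDSN are hypothetical, and the error wording is only a suggestion. The returned string is meant to be handed to whatever SQLite driver OpenHistoryDB already uses (assumed here to accept SQLite URI filenames); the sketch deliberately stops short of calling sql.Open so that it does not depend on a particular driver.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"net/url"
	"os"
)

// sqliteRWURI builds a SQLite URI filename with mode=rw, which opens an
// existing database read+write and never creates a new file.
// The path is percent-escaped so characters such as '?' stay part of the name.
func sqliteRWURI(path string) string {
	u := url.URL{Path: path}
	return "file:" + u.EscapedPath() + "?mode=rw"
}

// checkedHistoryDSN is a hypothetical helper: it stats the path first and
// returns an actionable error instead of letting SQLite create an empty file.
func checkedHistoryDSN(path string) (string, error) {
	info, err := os.Stat(path)
	switch {
	case errors.Is(err, fs.ErrNotExist):
		return "", fmt.Errorf("history db %q does not exist; pass --history-db "+
			"or set $COORDINATOR_DATA_DIRECTORY / $MASTER_DATA_DIRECTORY", path)
	case err != nil:
		return "", fmt.Errorf("cannot stat history db %q: %w", path, err)
	case info.IsDir():
		return "", fmt.Errorf("history db path %q is a directory, not a file", path)
	}
	return sqliteRWURI(path), nil
}
```

The os.Stat check exists for the error message; mode=rw is what actually prevents file creation, so the two together also cover the case where the file disappears between the check and the open.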
                                                                                                                                                                           


Operating System

Rocky Linux 9.x / RHEL 9.x
