MDEV-39061 mariadb-backup compatible wrappers for BACKUP SERVER #5140

Open: Thirunarayanan wants to merge 43 commits into MDEV-14992 from MDEV-39061

@Thirunarayanan (Member)

scripts/mariabackup/mariabackup.sh: a drop-in wrapper that
lets existing mariabackup --backup invocations drive the server-side
BACKUP SERVER command without changing user scripts.

mariabackup.sh translates
--backup into "BACKUP SERVER TO '<dir>'" via the mariadb client,
forwarding connection options, and layers
--stream/--compress as tar/gzip pipelines on the result. It
validates the target directory up front and rejects the

- mbstream.sh shims the mbstream CLI onto tar, dropping
mbstream-only flags (-p/--parallel) so legacy pipelines keep working.
- README.md maps every supported --backup option to its BACKUP SERVER
equivalent and documents current limitations.
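
For illustration, the --stream/--compress layering described above could look roughly like the following. This is a sketch under stated assumptions, not the wrapper's actual code: the function name stream_backup_dir and its arguments are hypothetical, and it assumes that BACKUP SERVER has already written the backup into the directory.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not taken from mariabackup.sh): emit a finished
# backup directory on stdout as a tar stream, optionally through gzip.
stream_backup_dir() {
  local dir=$1 compress=${2:-no}
  if [ "$compress" = yes ]; then
    # Archive the directory contents and compress the stream.
    tar -C "$dir" -cf - . | gzip -c
  else
    # Plain, uncompressed tar stream.
    tar -C "$dir" -cf - .
  fi
}
```

A caller would redirect the output, for example `stream_backup_dir /path/to/backup yes > backup.tar.gz`, where the path is a placeholder.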

Add include/have_mariabackup_wrapper.inc, which redirects
$XTRABACKUP to the wrapper so that a test opts in by sourcing one
file; it skips the test when the wrapper, bash, or the mariadb client
is unavailable.

wrapper_basic.test: Exercises full backup, streaming, compression,
the ignored legacy options

dr-m added 30 commits March 17, 2026 12:45
The InnoDB write-ahead log file in the old innodb_log_archive=OFF
format is named ib_logfile0, pre-allocated to innodb_log_file_size and
written as a ring buffer. This is good for write performance and space
management, but unsuitable for arbitrary point-in-time recovery or for
facilitating efficient incremental backup.

innodb_log_archive=ON: A new format where InnoDB will create and
preallocate files ib_%016x.log, instead of writing a circular file
ib_logfile0. Each file will be pre-allocated to innodb_log_file_size
(between 4M and 4G; we impose a stricter upper limit of 4 GiB for
innodb_log_archive=ON). Once a log fills up, we will create and
pre-allocate another log file, to which log records will be written.
Upon the completion of the first log checkpoint in a recently created
log file, the old log file will be marked read-only, signaling that
there will be no further writes to that file, and that the file may
safely be moved to long-term storage.

The file name includes the log sequence number (LSN) at file offset
12288 (log_t::START_OFFSET). Limiting the file size to 4 GiB allows us
to identify each checkpoint by storing a 32-bit big-endian offset into
the optional FILE_MODIFY and the mandatory FILE_CHECKPOINT records,
between 12288 and the end of the file.
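
As an illustration of the naming scheme, here is a minimal shell sketch. The function name archive_log_name is hypothetical, and the sketch assumes that the %016x field is the LSN corresponding to file offset 12288, as the paragraph above states.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the patch: format the name of an
# innodb_log_archive=ON file from the LSN found at file offset 12288.
archive_log_name() {
  local first_lsn=$1
  printf 'ib_%016x.log\n' "$first_lsn"
}

# A log file whose first record is at LSN 12288 (0x3000):
archive_log_name 12288   # prints ib_0000000000003000.log
```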

The innodb_encrypt_log format is identified by storing the encryption
information at the start of the log file. The first 32-bit value will
be 1, which is an invalid checkpoint offset. Each
innodb_log_archive=ON log must use the same encryption parameters.
Changing innodb_encrypt_log or related parameters is only possible by
setting innodb_log_archive=OFF and restarting the server, which will
permanently lose the history of the archived log.

The maximum number of log checkpoints that the innodb_log_archive=ON
file header can represent is limited to 12288/4=3072 when using
innodb_encrypt_log=OFF. If we run out of slots in a log file, each
subsequently completed checkpoint in that log file will overwrite the
last slot in the checkpoint header, until we switch to the next log.
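
The arithmetic can be checked with a small shell sketch. The names below are illustrative, the zero-based slot numbering is an assumption, and only the innodb_encrypt_log=OFF case is modelled.

```shell
#!/usr/bin/env bash
# Illustrative only: models the checkpoint slots in the header of an
# innodb_log_archive=ON file when innodb_encrypt_log=OFF.
START_OFFSET=12288   # header size in bytes (log_t::START_OFFSET)
SLOT_SIZE=4          # each checkpoint is stored as a 32-bit offset
MAX_SLOTS=$((START_OFFSET / SLOT_SIZE))

# Map the n-th completed checkpoint in a file (counting from 0, which
# is an assumption) to the slot it would occupy. Once the header is
# full, every further checkpoint reuses the last slot.
checkpoint_slot() {
  local n=$1
  if [ "$n" -lt "$MAX_SLOTS" ]; then
    echo "$n"
  else
    echo $((MAX_SLOTS - 1))
  fi
}

echo "$MAX_SLOTS"       # prints 3072
checkpoint_slot 5000    # prints 3071
```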

innodb_log_recovery_start: The checkpoint LSN to start recovery from.
This is useful when recovering from an archived log, for example when
restoring an incremental backup (applying InnoDB log files that were
copied since the previous restore).

innodb_log_recovery_target: The requested LSN to end recovery at.
When this is set, all persistent InnoDB tables will be read-only, and
no writes to the log are allowed. The intended purpose of this setting
is to prepare an incremental backup, as well as to allow data
retrieval as of a particular logical point of time.

Setting innodb_log_recovery_target>0 is much like setting
innodb_read_only=ON, with the exception that the data files may be
written to by crash recovery, and locking reads will conflict with any
incomplete transactions as necessary, and all transaction isolation
levels will work normally (not hard-wired to READ UNCOMMITTED).

srv_read_only_mode: When this is set (innodb_read_only=ON), also
recv_sys.rpo (innodb_log_recovery_target) will be set to the current LSN.
This ensures that it will suffice to check only one of these variables
when blocking writes to persistent tables.

The status variable innodb_lsn_archived will reflect the LSN
from which a complete InnoDB log archive is available. Its initial
value will be that of the new parameter innodb_log_archive_start.
If that variable is 0 (the default), innodb_lsn_archived will
be recovered from the available log files. If innodb_log_archive=OFF,
innodb_lsn_archived will be adjusted to the latest checkpoint every
time a log checkpoint is executed. If innodb_log_archive=ON, the value
should not change.

SET GLOBAL innodb_log_archive=!@@GLOBAL.innodb_log_archive will take
effect as soon as possible, possibly after a log checkpoint has been
completed. The log file will be renamed between ib_logfile0 and
ib_%016x.log as appropriate.

When innodb_log_archive=ON, the setting SET GLOBAL innodb_log_file_size
will affect subsequently created log files when the file that is being
currently written is running out. If we are switching log files exactly
at the same time, then a somewhat misleading error message
"innodb_log_file_size change is already in progress" will be issued.

no_checkpoint_prepare.inc: A new file, to prepare for subsequent
inclusion of no_checkpoint_end.inc. We will invoke the server to
parse the log and to determine the latest checkpoint.

All --suite=encryption tests that use innodb_encrypt_log
will be skipped for innodb_log_archive=ON, because enabling
or disabling encryption on the log is not possible without
temporarily setting innodb_log_archive=OFF and restarting
the server. The idea is to add the following arguments to an
invocation of mysql-test/mtr:

--mysqld=--loose-innodb-log-archive \
--mysqld=--loose-innodb-log-recovery-start=12288 \
--mysqld=--loose-innodb-log-file-mmap=OFF \
--skip-test=mariabackup

Alternatively, specify --mysqld=--loose-innodb-log-file-mmap=ON
to cover both code paths.

The mariabackup test suite must be skipped when using the
innodb_log_archive=ON format, because mariadb-backup will only
support the old ib_logfile0 format (innodb_log_archive=OFF).

A number of tests would fail when the parameter
innodb_log_recovery_start=12288 is present, which is forcing
recovery to start from the beginning of the history
(the database creation). The affected tests have been adjusted with
explicit --innodb-log-recovery-start=0 to override that:

(0) Some injected corruption may be "healed" by replaying the log
from the beginning. Some tests expect an empty buffer pool after
a restart, with no page I/O due to crash recovery.
(1) Any test that sets innodb_read_only=ON would fail with an error
message that the setting prevents crash recovery, unless
innodb_log_recovery_start=0.
(2) Any test that changes innodb_undo_tablespaces would fail in crash
recovery, because crash recovery assumes that the undo tablespace ID
that is available from the undo* files corresponds with the start of
the log. This is an unfortunate design bug which we cannot fix easily.

log_sys.first_lsn: The start of the current log file, to be consulted
in log_t::write_checkpoint() when renaming files.

log_sys.archived_lsn: New field: The value of innodb_lsn_archived.

log_sys.end_lsn: New field: The log_sys.get_lsn() when the latest
checkpoint was initiated. That is, the start LSN of a possibly empty
sequence of FILE_MODIFY records followed by FILE_CHECKPOINT.

log_sys.resize_target: The value of innodb_log_file_size that will be
used for creating the next archive log file once the current file (of
log_sys.file_size) fills up.

log_sys.archive: New field: The value of innodb_log_archive.

log_sys.next_checkpoint_no: Widen to uint16_t. There may be up to
12288/4=3072 checkpoints in the header.

log_sys.log: If innodb_log_archive=ON, this file handle will be kept
open also in the PMEM code path.

log_sys.resize_log: If innodb_log_archive=ON, we may have two log
files open both during normal operation and when parsing the log. This
will store the other handle (old or new file).

log_sys.resize_buf: In the memory-mapped code path, this will point
to the file resize_log when innodb_log_archive=ON.

recv_sys.log_archive: All innodb_log_archive=ON files that will be
considered in recovery.

recv_sys.was_archive: A flag indicating that an innodb_log_archive=ON
file is in innodb_log_archive=OFF format.

log_sys.is_pmem, log_t::is_mmap_writeable(): A new predicate.
If is_mmap_writeable(), we assert and guarantee buf_size == capacity().

log_t::archive_new_write(): Create and allocate a new log file, and
write the outstanding data to both the current and the new file, or
only to the new file, until write_checkpoint() completes the first
checkpoint in the new file.

log_t::archived_mmap_switch_prepare(): Create and memory-map a new log
file, and update file_size to resize_target. Remember the file handle
of the current log in resize_log, so that write_checkpoint() will be
able to make it read-only.

log_t::archived_mmap_switch_complete(): Switch to the buffer that was
created in archived_mmap_switch_prepare().

log_t::write_checkpoint(): Allow an old checkpoint to be completed in
the old log file even after a new one has been created. If we are
writing the first checkpoint in a new log file, we will mark the old
log file read-only. We will also update log_sys.first_lsn unless it
was already updated in the ARCHIVED_MMAP code path. In that code path,
there is the special case where log_sys.resize_buf == nullptr and
log_sys.checkpoint_buf points to log_sys.resize_log (the old log file
that is about to be made read-only). In this case, log_sys.first_lsn
will already point to the start of the current log_sys.log, even
though the switch has not been fully completed yet.

log_t::header_rewrite(my_bool): Rewrite the log file header before or
after renaming the log file, and write a message about the change,
so that there will be a chance to recover in case the server is being
killed during this operation. The recovery of the last ib_%016x.log
also tolerates the ib_logfile0 format.

log_t::set_archive(my_bool,THD): Implement SET GLOBAL innodb_log_archive.
An error will be returned if non-archived SET GLOBAL innodb_log_file_size
(log file resizing) is in progress. Wait for checkpoint if necessary.
The current log file will be renamed to either ib_logfile0 or
ib_%016x.log, as appropriate.

log_t::archive_rename(): Rename an archived log to ib_logfile0 on recovery
in case there had been a crash during set_archive().

log_t::archive_set_size(): A new function, to ensure that
log_sys.resize_target is set on startup.

log_checkpoint_low(): Do not prevent a checkpoint at the start of a file.
We want the first innodb_log_archive=ON file to start with a checkpoint.

log_t::create(lsn_t): Initialize last_checkpoint_lsn. Initialize the
log header as specified by log_sys.archive (innodb_log_archive).

log_write_buf(): Add the parameter max_length, the file wrap limit.

log_write_up_to(), mtr_t::commit_log_release<bool mmap=true>():
If we are switching log files, invoke buf_flush_ahead(lsn, true)
to ensure that a log checkpoint will be completed in the new file.

mtr_t::finish_writer(): Specialize for innodb_log_archive=ON.

mtr_t::commit_file(): Ensure that log archive rotation will complete.

log_t::append_prepare<log_t::ARCHIVED_MMAP>(): Special case.

log_t::get_path(): Get the name of the current log file.

log_t::get_circular_path(size_t): Get the path name of a circular file.
Replaces get_log_file_path().

log_t::get_archive_path(lsn_t): Return a name of an archived log file.

log_t::get_next_archive_path(): Return the name of the next archived log.

log_t::append_archive_name(): Append the archive log file name
to a path string.

mtr_t::finish_writer(): Invoke log_close() only if innodb_log_archive=OFF.
In the innodb_log_archive=ON case, we only force log checkpoints after creating
a new archive file, to ensure that the first checkpoint will be written
as soon as possible.

log_t::checkpoint_margin(): Replaces log_checkpoint_margin().
If a new archived log file has been created, wait for the
first checkpoint in that file.

srv_log_rebuild_if_needed(): Never rebuild if innodb_log_archive=ON.
The setting innodb_log_file_size will affect the creation of
subsequent log files. The parameter innodb_encrypt_log cannot be
changed while the log is in the innodb_log_archive=ON format.

log_t::attach(), log_mmap(): Add the parameter log_access,
to distinguish memory-mapped or read-only access.

log_t::attach(): When disabling innodb_log_file_mmap, read
checkpoint_buf from the last innodb_log_archive=ON file.

log_t::clear_mmap(): Clear the tail of the checkpoint buffer
if is_mmap_writeable().

log_t::set_recovered(): Invoke clear_mmap(), and restore the
log buffer to the correct position.

recv_sys_t::apply(): Let log_t::clear_mmap() enable log writes.

recv_sys_t::find_checkpoint(): Find and remember the checkpoint position
in the last file when innodb_log_recovery_start points to an older file.
When innodb_log_file_mmap=OFF, restore log_sys.checkpoint_buf from
the latest log file. If the last archive log file is actually
in innodb_log_archive=OFF format despite being named ib_%016x.log,
try to recover it in that format. If the circular ib_logfile0 is missing,
determine the oldest archived log file with contiguous LSN.
If innodb_log_archive=ON, refuse to start if ib_logfile0 exists.
Open non-last archived log files in read-only mode.

recv_sys_t::find_checkpoint_archived(): Validate each checkpoint in
the current file header, and by default aim to recover from the last
valid one. Terminate the search if the last validated checkpoint
spanned two files. If innodb_log_recovery_start has been specified,
attempt to validate it even if there is no such information stored
in the checkpoint header.

log_parse_file(): Do not invoke fil_name_process() during
recv_sys_t::find_checkpoint_archived(), when we tolerate FILE_MODIFY
records while looking for a FILE_CHECKPOINT record.

recv_scan_log(): Invoke log_t::archived_switch_recovery() upon
reaching the end of the current archived log file.

log_t::archived_switch_recovery_prepare(): Make use of
recv_sys.log_archive and open all but the last file read-only.

log_t::archived_switch_recovery(): Switch files in the pread() code path.

log_t::archived_mmap_switch_recovery_complete(): Switch files in the
memory-mapped code path.

recv_warp: A pointer wrapper for memory-mapped parsing that spans two
archive log files.

recv_sys_t::parse_mmap(): Use recv_warp for innodb_log_archive=ON.

recv_sys_t::parse(): Tweak some logic for innodb_log_archive=ON.

log_t::set_recovered_checkpoint(): Set the checkpoint on recovery.
Updates also the end_lsn.

log_t::set_recovered_lsn(): Also update flush_lock and write_lock,
to ensure that log_write_up_to() will be a no-op.

log_t::persist(): Even if the flushed_to_disk_lsn does not change,
we may want to reset the write_lsn_offset.

innodb.log_archive: Mask the innodb_lsn_flushed, because just like
the current LSN which we already masked, it may differ by the size
of a FILE_CHECKPOINT record (16 bytes).

This fixes up 076a99e (MDEV-37949).

In the memory-mapped log writing code path we were wrongly assuming
that innodb_log_file_size cannot exceed 4GiB. This assumption only
holds for innodb_log_archive=ON.

Now that we removed the assignments to log_sys.buf_size
we must refer to capacity(). The expression was carefully
written in this way so that GCC 16 -O2 would produce more compact code.

Remove all traces of the buf_size= capacity() assignments

log_t::archive_create(bool): Create and allocate an archive log.

log_t::write_checkpoint(): Try to preallocate the next archive file
if needed, with the goal that when we need the file it will already
be ready for use.

FIXME: Adjust crash recovery so that it will tolerate the extra empty
files. We are missing calls to log_sys.unstash_archive_file() when the
last file is unusable for recovery (filled with zeroes).

log_t::archive_create(): Tolerate a larger than zero-sized file.

log_t::set_recovered_lsn(): Invoke unstash_archive_file() in case there was
a garbage (pre-allocated) file at the end which was not parsed at all.

log_file_is_zero(): Check if a log file starts with NUL bytes
(is a preallocated file).

recv_sys_t::find_checkpoint(): Open the last non-preallocated log file in
read/write mode.

recv_sys_t::archive_map: Make the elements const.

This introduces a basic driver Sql_cmd_backup, storage engine interfaces,
and basic copying of InnoDB data files.
On Windows, we pass a target directory name; elsewhere, we pass a
target directory handle.

fil_space_t::write_or_backup: Keep track of in-flight page writes and
a pending backup operation. We must not allow them concurrently, because
that could lead to torn pages in the backup.

fil_space_t::backup_end: The first page number that is not being backed up
(by default 0, to indicate that no backup is in progress).

log_t::backup: Whether BACKUP SERVER is in progress. The purpose of this
is to make BACKUP SERVER prevent the concurrent execution of
SET GLOBAL innodb_log_archive=OFF or SET GLOBAL innodb_log_file_size
when innodb_log_archive=OFF.

log_sys.archived_checkpoint: Keep track of the earliest available
checkpoint, corresponding to log_sys.archived_lsn. This reflects
SET GLOBAL innodb_log_recovery_start (which is settable now), for
incremental backup.

buf_flush_list_space(): Check for concurrent backup before writing each
page. This is inefficient, but this function may be invoked from multiple
threads concurrently, and it cannot be changed easily, especially for
fil_crypt_thread().

TODO: Implement finer-grained locking around copying page ranges.

TODO: Implement other storage engine interfaces.

TODO: Implement the necessary locking around backup_end.

TODO: Fix the space.get_create_lsn() < checkpoint logic.

TODO: Duplicate the last log file at the end

innodb_backup_checkpoint(): Invoked when log checkpoint is switching
to a new file.

Fix a race in checkpoint_complete().

Fix the FreeBSD build.

Try to catch the log I/O error on Windows.

Release write fix when skipping a write

log_t::set_archive(): In SET GLOBAL innodb_log_archive=OFF,
trigger a write-ahead of the log if necessary, to prevent overrun.

Do not ignore errors from backup_end
dr-m and others added 13 commits April 17, 2026 14:22
This is an initial simple implementation which copies all the Aria files
in the "end" phase of the backup. Nothing protects the copy from
concurrent DDL or DML. Copying only works on macOS (intended for
refactoring to use a common file copy method across engines and the SQL
layer).

Enable backup for non-Apple systems.

Copy non-Aria-specific files *.frm and db.opt as part of Aria backup.
@Thirunarayanan Thirunarayanan requested a review from dr-m May 28, 2026 09:52
CLAassistant commented May 28, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 3 committers have signed the CLA.

❌ mariadb-andrzejjarzabek
❌ dr-m
❌ Thirunarayanan

gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces server-side backup support (BACKUP SERVER) for MariaDB, implementing backup and log archiving mechanisms for both the InnoDB and Aria storage engines. It also includes a compatibility wrapper script (mariabackup.sh) to map legacy mariabackup CLI commands to the new server-side SQL interface. The code review identified several critical issues, including a potential server crash in backup_innodb.cc due to invalid format arguments in error reporting, null pointer dereferences and assertion failures in both engines when backup steps are executed out of order or fail to initialize, and security vulnerabilities (command injection) and argument parsing bugs in the shell wrapper script.

{
fail:
fail= 1;
my_error(ER_ERROR_ON_RENAME, MYF(ME_ERROR_LOG), s, d, errno);

critical

Passing s and d (which are int file descriptors on Unix) to my_error with the ER_ERROR_ON_RENAME format specifier will cause a segmentation fault/crash because ER_ERROR_ON_RENAME expects string arguments (const char*). They should be replaced with the actual file names.

Suggested change
- my_error(ER_ERROR_ON_RENAME, MYF(ME_ERROR_LOG), s, d, errno);
+ my_error(ER_ERROR_ON_RENAME, MYF(ME_ERROR_LOG), "ib_logfile101", dst.c_str(), errno);

Comment on lines +347 to +352
int aria_backup_end(THD *thd, bool abort) noexcept
{
  int ret_val= aria_backup->end(thd, abort);
  aria_backup.reset();
  return ret_val;
}

critical

If aria_backup_start was never called (e.g., because another plugin's backup_start failed earlier in plugin_foreach_with_mask), aria_backup will be nullptr. When plugin_foreach_with_mask subsequently calls backup_end for all plugins, aria_backup_end will dereference aria_backup without a null check, causing a server crash. A null check should be added.

int aria_backup_end(THD *thd, bool abort) noexcept
{
  if (!aria_backup)
    return 0;
  int ret_val= aria_backup->end(thd, abort);
  aria_backup.reset();
  return ret_val;
}

Comment on lines +195 to +198
  int end(THD *thd, bool abort) noexcept
  {
    int fail= 0;
    log_sys.latch.wr_lock();

high

If backup_start was never called or failed, max_first_lsn is 0. In this case, calling InnoDB_backup::end will trigger assertions like ut_ad(log_sys.archive) which can fail and crash the debug server. We should return early if max_first_lsn is 0.

  int end(THD *thd, bool abort) noexcept
  {
    int fail= 0;
    log_sys.latch.wr_lock();
    if (!max_first_lsn)
    {
      log_sys.latch.wr_unlock();
      return 0;
    }

Comment on lines +276 to +283
    if (!max_first_lsn)
    {
      ut_ad(queue.empty());
      if (old_size)
        delete_logs();
      logs.clear();
    }
    log_sys.latch.wr_unlock();

high

If backup_start was never called or failed, max_first_lsn is 0. In this case, calling InnoDB_backup::fini will check the transaction state and trigger the assertion ut_ad("invalid state" == 0) because the transaction state is not TRX_STATE_BACKUP. We should return early if max_first_lsn is 0.

Suggested change
-     if (!max_first_lsn)
-     {
-       ut_ad(queue.empty());
-       if (old_size)
-         delete_logs();
-       logs.clear();
-     }
-     log_sys.latch.wr_unlock();
+     if (!max_first_lsn)
+     {
+       ut_ad(queue.empty());
+       if (old_size)
+         delete_logs();
+       logs.clear();
+       log_sys.latch.wr_unlock();
+       return 0;
+     }
+     log_sys.latch.wr_unlock();

Comment on lines +315 to +328
if [[ -n "$FINAL_INCLUDE" ]]; then
    echo "Setting backup_include='$FINAL_INCLUDE'" >&2
    mariadb $MARIADB_OPTS -e "SET GLOBAL backup_include='$FINAL_INCLUDE'"
fi

if [[ -n "$FINAL_EXCLUDE" ]]; then
    echo "Setting backup_exclude='$FINAL_EXCLUDE'" >&2
    mariadb $MARIADB_OPTS -e "SET GLOBAL backup_exclude='$FINAL_EXCLUDE'"
fi

# Execute BACKUP SERVER
SQL="BACKUP SERVER TO '$TARGET_DIR'"
echo "Executing: $SQL" >&2
mariadb $MARIADB_OPTS -e "$SQL"

security-high high

The parameters FINAL_INCLUDE, FINAL_EXCLUDE, and TARGET_DIR are interpolated into shell commands. To effectively prevent command injection, we should validate these inputs using a strict whitelist of allowed characters (such as alphanumeric characters and a limited set of safe symbols like '-', '_', '.') instead of relying on a blacklist or simple escaping.

Suggested change
- if [[ -n "$FINAL_INCLUDE" ]]; then
-     echo "Setting backup_include='$FINAL_INCLUDE'" >&2
-     mariadb $MARIADB_OPTS -e "SET GLOBAL backup_include='$FINAL_INCLUDE'"
- fi
- if [[ -n "$FINAL_EXCLUDE" ]]; then
-     echo "Setting backup_exclude='$FINAL_EXCLUDE'" >&2
-     mariadb $MARIADB_OPTS -e "SET GLOBAL backup_exclude='$FINAL_EXCLUDE'"
- fi
- # Execute BACKUP SERVER
- SQL="BACKUP SERVER TO '$TARGET_DIR'"
- echo "Executing: $SQL" >&2
- mariadb $MARIADB_OPTS -e "$SQL"
+ if [[ -n "$FINAL_INCLUDE" && ! "$FINAL_INCLUDE" =~ ^[a-zA-Z0-9_.-]+$ ]]; then
+     echo "Error: Invalid characters in FINAL_INCLUDE" >&2
+     exit 1
+ fi
References
  1. When validating input that will be interpolated into shell commands, use a strict whitelist of allowed characters (e.g., alphanumeric and a limited set of safe symbols like '-', '_', '.') instead of a blacklist to effectively prevent command injection.
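
A reusable form of such a check might look like the following. It is a sketch only: the function name is_safe_value is hypothetical, and the character set is the one proposed above, which would need extending (for example with '/') before it could accept a TARGET_DIR path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the value consists solely of
# characters from a strict allow-list (letters, digits, '_', '.', '-').
is_safe_value() {
  [[ $1 =~ ^[a-zA-Z0-9_.-]+$ ]]
}

# Example: one harmless value and one that tries to break out of the
# quoted SQL string literal.
for candidate in "db1.t1" "x'; DROP TABLE t; --"; do
  if is_safe_value "$candidate"; then
    echo "accepted: $candidate"
  else
    echo "rejected: $candidate" >&2
  fi
done
```

An empty value is also rejected by this pattern, so callers would keep the existing `-n` test to decide whether the option was given at all.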

TABLES_PATTERN=""
TABLES_EXCLUDE_PATTERN=""
TABLES_FILE=""
MARIADB_OPTS=""

medium

Handling MARIADB_OPTS as a flat string will break if any option contains spaces or special characters (e.g., --password="my secret pass"). We should use a bash array instead.

Suggested change
- MARIADB_OPTS=""
+ MARIADB_OPTS=()
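
To make the difference concrete, here is a small self-contained demonstration. The helper count_args and the sample password are illustrative only and do not appear in the wrapper.

```shell
#!/usr/bin/env bash
# Illustrative helper: report how many arguments a command would receive.
count_args() { echo "$#"; }

# Flat string: the unquoted expansion is word-split on whitespace.
OPTS_STRING="--password=my secret pass"
count_args $OPTS_STRING          # prints 3

# Bash array: each element stays a single argument when expanded
# as "${array[@]}".
OPTS_ARRAY=()
OPTS_ARRAY+=("--password=my secret pass")
count_args "${OPTS_ARRAY[@]}"    # prints 1
```

With the array form, the client would be invoked as `mariadb "${MARIADB_OPTS[@]}" -e "$SQL"`, so every call site that currently expands the flat string would need the same change.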

Comment on lines +74 to +100
--user=*|--password=*|--host=*|--port=*|--socket=*)
# Connection options (long form) - pass through to mariadb client
MARIADB_OPTS="$MARIADB_OPTS $1"
shift
;;
-u|-p|-h|-P|-S)
# Short form. If the next argv looks like a value (not another
# flag), consume it as well — supports both `-u USER` and bare
# `-p` (which mariadb client treats as a password prompt).
if [[ -n "${2-}" && "$2" != -* ]]; then
MARIADB_OPTS="$MARIADB_OPTS $1 $2"
shift 2
else
MARIADB_OPTS="$MARIADB_OPTS $1"
shift
fi
;;
-u*|-p*|-h*|-P*|-S*)
# Bundled short form: -uUSER, -pSECRET, -hHOST, ...
MARIADB_OPTS="$MARIADB_OPTS $1"
shift
;;
--defaults-file=*|--defaults-extra-file=*)
# Config file options - pass through to mariadb client
MARIADB_OPTS="$MARIADB_OPTS $1"
shift
;;

medium

We should append connection options to the MARIADB_OPTS array instead of concatenating them into a flat string to correctly handle arguments with spaces or special characters.

        --user=*|--password=*|--host=*|--port=*|--socket=*)
            # Connection options (long form) - pass through to mariadb client
            MARIADB_OPTS+=("$1")
            shift
            ;;
        -u|-p|-h|-P|-S)
            # Short form. If the next argv looks like a value (not another
            # flag), consume it as well — supports both `-u USER` and bare
            # `-p` (which mariadb client treats as a password prompt).
            if [[ -n "${2-}" && "$2" != -* ]]; then
                MARIADB_OPTS+=("$1" "$2")
                shift 2
            else
                MARIADB_OPTS+=("$1")
                shift
            fi
            ;;
        -u*|-p*|-h*|-P*|-S*)
            # Bundled short form: -uUSER, -pSECRET, -hHOST, ...
            MARIADB_OPTS+=("$1")
            shift
            ;;
        --defaults-file=*|--defaults-extra-file=*)
            # Config file options - pass through to mariadb client
            MARIADB_OPTS+=("$1")
            shift
            ;;

Comment thread sql/sql_backup.cc
}

#ifndef _WIN32
int dir= open(target.str, O_DIRECTORY);

medium

To be fully standard and portable, open with O_DIRECTORY should explicitly include the access mode O_RDONLY.

Suggested change
- int dir= open(target.str, O_DIRECTORY);
+ int dir= open(target.str, O_RDONLY | O_DIRECTORY);

  explicit Aria_backup(THD *thd, Target target) noexcept
    : target(target)
#ifndef _WIN32
    , datadir_fd(open(maria_data_root, O_DIRECTORY))

medium

To be fully standard and portable, open with O_DIRECTORY should explicitly include the access mode O_RDONLY.

    , datadir_fd(open(maria_data_root, O_RDONLY | O_DIRECTORY))

@grooverdan (Member)

Hey @Thirunarayanan, have you thought about how this is going to end up in packaging? Is it replacing the original mariadb-backup?

It needs some INSTALL/INSTALL_SCRIPT cmake directives around this to give it an install location and a cmake component. Debian installation/packaging would need the relevant debian/{package}.install to include the script.
