So I use Gitea to host my Git repositories, because reasons. Backing up your data is important, and Gitea can dump its data to a zip file, which makes backing it up easy.

Below is a simple script I put together to handle these dump files reliably. It moves each dump file to a separate directory and keeps only the 10 most recent zip files. I later copy these to another computer for safekeeping using Syncthing…because reasons.

Hopefully this saves someone else from having to write a similar script themselves.

#!/usr/bin/env python3

import pathlib
import shutil
import os

import logging

logger = logging.getLogger()
handler = logging.StreamHandler()
formatter = logging.Formatter('%(levelname)-6s %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

dump_directory = "/var/lib/gitea"
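backup_directory = "/path/to/backup/folder/gitea"

# Make sure the destination exists; otherwise shutil.move would rename the
# dump to this path instead of moving it into a directory.
pathlib.Path(backup_directory).mkdir(parents=True, exist_ok=True)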

path_obj = pathlib.Path(dump_directory)
zip_files = path_obj.glob("gitea-dump-*.zip")

for f in zip_files:
    logger.debug("Moving %s to backup directory...", f)
    try:
        shutil.move(str(f), backup_directory)
    except shutil.Error:
        logger.debug("File already exists!")
        try:
            os.remove(str(f))
            logger.debug("Removed duplicate zip file")
        except OSError:
            logger.warning("Failed to remove duplicate zip")

# keep only the newest 10 snapshots
backup_obj = pathlib.Path(backup_directory)
backups = list(backup_obj.glob("*.zip"))

# Sort by modified time (oldest first); everything before the newest 10 is old
backups_sorted = sorted(backups, key=lambda f: f.stat().st_mtime)
old_snapshots = backups_sorted[:-10]
keepers = backups_sorted[-10:]

logger.info("Found %s snapshots", len(backups_sorted))

snapshot_complement = [s for s in old_snapshots
                       if s not in keepers]

for snapshot in snapshot_complement:
    logger.info("Removing %s", snapshot)
    os.remove(snapshot)
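
If you want to check the retention logic before letting it delete anything, here is a minimal sketch of the same "keep the newest N" idea as a standalone function. The names `split_snapshots` and `keep` are made up for illustration and are not part of the script above; the dry run only touches a temporary directory with fake dump files.

```python
import os
import pathlib
import tempfile


def split_snapshots(directory, keep=10):
    """Return (keepers, stale) for the *.zip files in `directory`.

    Files are ordered by modification time, oldest first. The newest
    `keep` files are keepers; everything older is stale.
    """
    snapshots = sorted(pathlib.Path(directory).glob("*.zip"),
                       key=lambda p: p.stat().st_mtime)
    if keep <= 0:
        # Guard: snapshots[-0:] would return the whole list.
        return [], snapshots
    return snapshots[-keep:], snapshots[:-keep]


if __name__ == "__main__":
    # Dry run against a throwaway directory containing 13 fake dumps.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(13):
            path = pathlib.Path(tmp) / f"gitea-dump-{i:02d}.zip"
            path.touch()
            # Give each file a distinct, increasing modification time.
            os.utime(path, (1_000_000 + i, 1_000_000 + i))
        keepers, stale = split_snapshots(tmp, keep=10)
        print("keep:", len(keepers), "remove:", [p.name for p in stale])
```

Nothing is deleted here; the function only reports which files would go, so you can try different values of `keep` safely.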

And since I run NixOS on most of my machines, this is the service declaration for running this script every night:

systemd.services.backupGitea =
  {
    enable = true;
    description = "backup gitea";
    path = [ pkgs.python3 ];
    serviceConfig =
      {
        Type = "oneshot";
        ExecStart = "${pkgs.python3}/bin/python3 /path/to/gitea_backup.py";
      };
    wantedBy = [ "default.target" ];
  };

systemd.timers.backupGitea =
  {
    description = "backup gitea";
    partOf = [ "backupGitea.service" ];
    wantedBy = [ "timers.target" ];
    timerConfig.OnCalendar = "04:45";
  };