Backing Up GitHub Repositories to Amazon S3 (What Nobody Warns You About)
Source: Dev.to
Why back up GitHub repositories?
GitHub can become a single point of failure for work you care about. A deleted repository, a locked account, or a force‑push can wipe everything out. An off‑platform, automated backup solves this problem.
Why Amazon S3?
- Independent of GitHub
- Cheap
- Extremely durable
- Built for long‑term storage
What this guide covers
- Backing up multiple GitHub repositories
- Running backups weekly
- Preserving full Git history (branches + tags)
- Using OIDC + temporary credentials (no long‑lived AWS access keys)
- Storing backups safely in Amazon S3
Git bundle vs. ZIP
ZIP backups
- ❌ Lose commit history
- ❌ Drop branches and tags
- ❌ Painful to restore correctly
Git bundle (single portable file)
- Contains all commits, branches, and tags
git bundle create repo-backup.bundle --all
If a backup can’t restore history, it isn’t a backup.
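A quick way to convince yourself that a bundle keeps history is a round trip in a throwaway directory. The sketch below assumes `git` and `mktemp` are installed; the repository name `demo`, the branch `feature`, and the tag `v1.0` are invented for the example.

```shell
#!/bin/sh
set -eu

workdir="$(mktemp -d)"
cd "$workdir"

# Throwaway repository with two commits, a second branch, and a tag.
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/main   # pin the branch name regardless of git defaults
commit() { git -c user.name=demo -c user.email=demo@example.invalid commit -q "$@"; }

echo one > file.txt
git add file.txt
commit -m "first commit"
git tag v1.0
git branch feature
echo two >> file.txt
commit -am "second commit"

# Pack every ref (branches and tags) plus the full history into one file.
git bundle create ../demo.bundle --all

# Check that the bundle is valid, then list the refs it carries.
git bundle verify ../demo.bundle
git bundle list-heads ../demo.bundle

# Restore from the bundle alone and inspect what came back.
cd "$workdir"
git clone -q demo.bundle restored
commit_count="$(git -C restored rev-list --count HEAD)"
tag_list="$(git -C restored tag --list)"
echo "commits restored: $commit_count, tags restored: $tag_list"
```

In the restored clone the `feature` branch arrives as `origin/feature`, a remote-tracking ref, rather than as a local branch.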
AWS permissions: common pitfalls
Two AWS policy types are involved here:
| Policy type | Used for | Requires Principal |
|---|---|---|
| IAM role policy | Identity permissions | ❌ |
| S3 bucket policy | Resource permissions | ✅ |
A common error is pasting an IAM role policy into an S3 bucket policy or using the wrong principal ARN.
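Upload succeeds only if the policies line up
- Nothing on either side explicitly denies the action.
- Same account: the IAM role policy or the S3 bucket policy allows it.
- Cross-account: the IAM role policy and the S3 bucket policy must both allow it.
Otherwise → AccessDenied.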
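How the two sides combine can be sketched as a small decision function. This is a simplified model, not AWS's evaluation engine. It assumes an explicit Deny always wins, that one Allow is enough inside a single account, and that both sides must allow across accounts. It ignores SCPs, permission boundaries, and session policies. The function name `evaluate_access` and its yes/no arguments are made up for illustration.

```shell
# Simplified model of how a role policy and a bucket policy combine.
# Arguments are "yes" or "no", in this order:
#   identity_allows  bucket_allows  explicit_deny  same_account
evaluate_access() {
  identity_allows="$1"
  bucket_allows="$2"
  explicit_deny="$3"
  same_account="$4"

  # An explicit Deny in either policy overrides every Allow.
  if [ "$explicit_deny" = "yes" ]; then
    echo "AccessDenied"
    return 0
  fi

  if [ "$same_account" = "yes" ]; then
    # Same account: an Allow on either side is sufficient.
    if [ "$identity_allows" = "yes" ] || [ "$bucket_allows" = "yes" ]; then
      echo "Allow"
    else
      echo "AccessDenied"
    fi
  else
    # Cross-account: both sides must allow.
    if [ "$identity_allows" = "yes" ] && [ "$bucket_allows" = "yes" ]; then
      echo "Allow"
    else
      echo "AccessDenied"
    fi
  fi
}

evaluate_access yes no no yes   # role policy allows, bucket policy silent, same account
evaluate_access yes no no no    # same policies, but the bucket is in another account
```

Walking a failing upload through these branches narrows down which policy to inspect first.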
GitHub Actions workflow
name: Weekly S3 Repo Backup
on:
  schedule:
    - cron: "15 3 * * 0" # Sundays at 03:15 UTC
  workflow_dispatch: {}
permissions:
  id-token: write # lets the job request an OIDC token
  contents: read
jobs:
  backup:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout full history
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Create git bundle
        run: |
          set -e
          REPO_NAME="${GITHUB_REPOSITORY#*/}"
          TS="$(date -u +%Y-%m-%dT%H-%M-%SZ)"
          mkdir -p backups
          git bundle create "backups/${REPO_NAME}-${TS}.bundle" --all
          sha256sum "backups/${REPO_NAME}-${TS}.bundle" > "backups/${REPO_NAME}-${TS}.sha256"
      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: # set to the ARN of the IAM role the workflow should assume
          aws-region: # set to the region of your bucket
      - name: Upload to S3
        run: |
          # Put your bucket name between "s3://" and "/github-backups".
          aws s3 cp backups/ \
            s3:///github-backups/${GITHUB_REPOSITORY}/ \
            --recursive
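The naming logic in the bundle step can be tried locally. A small sketch, assuming a POSIX shell; `GITHUB_REPOSITORY` is normally set by GitHub Actions as `owner/repo`, and the value `example-org/example-repo` below is a made-up stand-in.

```shell
# Derive the bundle file name and S3 key prefix the same way the workflow does.
GITHUB_REPOSITORY="example-org/example-repo"   # hypothetical value for local testing

REPO_NAME="${GITHUB_REPOSITORY#*/}"            # strip everything up to the first "/"
TS="$(date -u +%Y-%m-%dT%H-%M-%SZ)"            # UTC timestamp without ":" so it is safe in file names
BUNDLE="backups/${REPO_NAME}-${TS}.bundle"
KEY_PREFIX="github-backups/${GITHUB_REPOSITORY}/"

echo "bundle file: $BUNDLE"
echo "S3 prefix:   $KEY_PREFIX"
```

Because the timestamp is part of the file name, each run adds new objects under the prefix instead of overwriting the previous backup.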
Minimal Terraform configuration
resource "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
  client_id_list = [
    "sts.amazonaws.com"
  ]
  thumbprint_list = [
    "6938fd4d98bab03faadb97b34396831e3780aea1"
  ]
}
resource "aws_iam_role" "github_backup" {
  name = "github-actions-s3-backup"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = "sts:AssumeRoleWithWebIdentity"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Condition = {
        StringLike = {
          # "repo:*/*:*" matches every repository on GitHub. Narrow it to your own,
          # e.g. "repo:OWNER/REPO:*" (OWNER and REPO are placeholders).
          "token.actions.githubusercontent.com:sub" = "repo:*/*:*"
        }
      }
    }]
  })
}
resource "aws_iam_role_policy" "s3_backup" {
  role = aws_iam_role.github_backup.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = ["s3:ListBucket"]
        Resource = "arn:aws:s3:::example-backup-bucket"
      },
      {
        Effect = "Allow"
        Action = [
          "s3:PutObject",
          "s3:AbortMultipartUpload"
        ]
        Resource = "arn:aws:s3:::example-backup-bucket/*"
      }
    ]
  })
}
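The `sub` condition in the trust policy decides which workflows may assume the role, so it is worth checking what a pattern actually matches. The sketch below approximates the `*` wildcard of IAM's `StringLike` with a shell `case` pattern. The function name `sub_matches` and both repository names are hypothetical, and the real evaluation happens inside AWS STS.

```shell
# Print "match" if the GitHub OIDC subject claim fits the wildcard pattern.
sub_matches() {
  pattern="$1"
  sub="$2"
  # The unquoted $pattern is treated as a glob, so "*" matches any run of characters.
  case "$sub" in
    $pattern) echo "match" ;;
    *)        echo "no match" ;;
  esac
}

own="repo:example-org/example-repo:ref:refs/heads/main"
stranger="repo:someone-else/unrelated:ref:refs/heads/main"

sub_matches 'repo:*/*:*' "$own"                            # prints "match"
sub_matches 'repo:*/*:*' "$stranger"                       # prints "match"
sub_matches 'repo:example-org/example-repo:*' "$own"       # prints "match"
sub_matches 'repo:example-org/example-repo:*' "$stranger"  # prints "no match"
```

The second call is the one to worry about: a fully wildcarded pattern lets a workflow in an unrelated repository satisfy the condition.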
Restoring a backup
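git clone repo-backup.bundle restored-repo
cd restored-repo
# "origin" now points at the bundle file; repoint it at the new, empty repository.
# OWNER/NEW-REPO is a placeholder.
git remote set-url origin https://github.com/OWNER/NEW-REPO.git
# --all pushes only local branches; check `git branch -a` and create local branches for any others you need.
git push --all origin
git push --tags origin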
No GitHub API is required.
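Before restoring, it can be worth checking the downloaded bundle against the `.sha256` file the workflow uploaded next to it. A minimal sketch, assuming GNU `sha256sum` is available; `verify_bundle` is a hypothetical helper. It compares only the hash field, because the checksum file written by the workflow records the path `backups/...`, which will not exist after a download into another directory.

```shell
# Compare a bundle's SHA-256 with the hash stored in its checksum file.
verify_bundle() {
  bundle="$1"
  sha_file="$2"
  expected="$(cut -d ' ' -f 1 "$sha_file")"            # first field is the recorded hash
  actual="$(sha256sum "$bundle" | cut -d ' ' -f 1)"    # hash of the file we actually have
  if [ "$expected" = "$actual" ]; then
    echo "checksum ok"
  else
    echo "checksum MISMATCH" >&2
    return 1
  fi
}
```

A typical call would be `verify_bundle repo-backup.bundle repo-backup.sha256`, followed by the `git clone` only if it reports success.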
Best‑practice checklist
- Sketch trust relationships before writing policies.
- Don’t trust AWS error messages blindly.
- Never use the root account as a bucket principal.
- Test with one repository before scaling.
- Keep backups boring and reliable.
A good backup system is something you forget about until the day you need it—then it should just work.