GitHub Backup Process Overview
GitProtect also protects GitHub. To protect your entire GitHub environment reliably, back up all repositories together with their related metadata. A good practice is to create one backup plan for critical repositories and metadata that change daily (or more often), for example using the recommended Grandfather-Father-Son (GFS) rotation scheme, and a second backup plan for unused repositories that you need to keep for future reference. The second plan serves mainly as a GitHub archive: with unlimited retention, you can store your copies for as long as you need, even indefinitely. You can even delete those repositories from your GitHub account and keep only the copy in storage, which helps you stay within GitHub limits.
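As an illustration of how a GFS rotation decides which copies to keep, here is a minimal sketch. It is not GitProtect's implementation: the function name `gfs_keep` and the retention counts (7 daily, 4 weekly, 12 monthly) are assumptions chosen only for the example.

```python
from datetime import date, timedelta


def gfs_keep(backup_dates, daily=7, weekly=4, monthly=12):
    """Return the set of backup dates a simple GFS policy would retain.

    The retention counts are illustrative assumptions, not product defaults.
    """
    dates = sorted(set(backup_dates), reverse=True)  # newest first
    keep = set()

    # Sons: the most recent `daily` backups.
    keep.update(dates[:daily])

    # Fathers: the newest backup in each of the most recent `weekly` ISO weeks.
    newest_per_week = {}
    for d in dates:
        # Dates are newest-first, so the first one seen per week is the newest.
        newest_per_week.setdefault(tuple(d.isocalendar()[:2]), d)
    keep.update(sorted(newest_per_week.values(), reverse=True)[:weekly])

    # Grandfathers: the newest backup in each of the most recent `monthly` months.
    newest_per_month = {}
    for d in dates:
        newest_per_month.setdefault((d.year, d.month), d)
    keep.update(sorted(newest_per_month.values(), reverse=True)[:monthly])

    return keep


# Usage: sixty consecutive daily backups starting on 1 January 2024.
history = [date(2024, 1, 1) + timedelta(days=i) for i in range(60)]
retained = gfs_keep(history)
```

Every date outside `retained` would be eligible for pruning, while an archive plan with unlimited retention would skip pruning altogether.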
Backup type
Incremental and differential backups save storage space. Your backup software should copy only the blocks of your GitHub data that have changed since the last copy, which reduces the backup size on your storage, speeds up the backup, and limits bandwidth use. Ideally, you should also be able to define different retention and performance schemes for each type of copy (full, incremental, and differential).
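The difference between the three copy types comes down to the reference point used to detect changes. The sketch below assumes the common definitions (a differential copy compares against the last full copy, an incremental copy against the last copy of any type) and works on whole items with plain numeric timestamps rather than on blocks. The function names and sample data are hypothetical.

```python
def select_changed(items, since):
    """Return the names of items modified after the `since` timestamp."""
    return sorted(name for name, modified in items.items() if modified > since)


def plan_backup(items, last_full, last_any, kind):
    """Pick what to copy for a full, differential, or incremental backup.

    items     -- mapping of item name to last-modified timestamp
    last_full -- timestamp of the last full backup
    last_any  -- timestamp of the last backup of any type
    """
    if kind == "full":
        return sorted(items)                      # copy everything
    if kind == "differential":
        return select_changed(items, last_full)   # changes since last full copy
    if kind == "incremental":
        return select_changed(items, last_any)    # changes since last copy of any type
    raise ValueError(f"unknown backup type: {kind}")


# Usage: three repositories, a full copy at t=20 and an incremental at t=30.
repos = {"repo-a": 10, "repo-b": 25, "repo-c": 40}
differential = plan_backup(repos, last_full=20, last_any=30, kind="differential")
incremental = plan_backup(repos, last_full=20, last_any=30, kind="incremental")
```

In this example the incremental copy is the smallest of the three, and the differential copy grows until the next full copy resets its reference point.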
Adding multiple storage instances
Use different types of storage and replicate backups between them to reduce the risk from any outage or disaster and to meet the 3-2-1 backup rule. The rule says that you should keep at least 3 copies on at least 2 different storage instances, with at least 1 copy off-site, for example in the cloud. GitProtect is a multi-storage system. It allows you to store your data:
in the cloud (GitProtect Cloud, AWS S3, Wasabi Cloud, Backblaze B2, Google Cloud Storage, Azure Blob Storage, and any public cloud compatible with S3),
locally (NFS, CIFS, SMB network shares, local disk resources),
in a hybrid or multi-cloud environment.
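A storage layout can be checked against the 3-2-1 rule with a few lines of code. This is only a sketch: the `BackupCopy` type and the storage names are hypothetical, and a cloud location is treated as the copy that satisfies the "1" in the rule, as in the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    storage: str    # hypothetical storage instance name, e.g. "nas-01"
    in_cloud: bool  # True when the storage instance is a cloud location


def meets_3_2_1(copies):
    """Check: at least 3 copies, on at least 2 storage instances, at least 1 in the cloud."""
    enough_copies = len(copies) >= 3
    enough_storages = len({c.storage for c in copies}) >= 2
    one_in_cloud = any(c.in_cloud for c in copies)
    return enough_copies and enough_storages and one_in_cloud


# Usage: two copies on a local share plus one replicated to cloud storage.
layout = [
    BackupCopy("nas-01", in_cloud=False),
    BackupCopy("nas-01", in_cloud=False),
    BackupCopy("s3-bucket", in_cloud=True),
]
compliant = meets_3_2_1(layout)
```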
Create a dedicated GitHub user
The best practice for large enterprise users is to create a dedicated GitHub user account (e.g. backup@companyname.com) that is connected to the GitHub backup software and used only for backups. There are two reasons for this. The first is security: this user should have access only to the repositories it is meant to protect. The second is throttling: each GitHub user has its own pool of requests to the GitHub API, and every application associated with that account draws from the same pool. A separate user therefore gives the backup its own request pool, so it can run smoothly without queuing or delay. If you manage a large organization with numerous repositories, it is worth having several GitHub users dedicated to backup within your GitHub account: when the first one exhausts its API requests, the next one is attached automatically, and so on. The backup of even the largest GitHub environment then runs without interruption.
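The switch from one dedicated user to the next can be pictured as a simple pool of accounts, each with its own remaining request quota. The sketch below is a local simulation under stated assumptions: the `AccountPool` class, the account names, and the quota numbers are hypothetical, and no GitHub API call is made. In a real client the remaining quota could be read from the `X-RateLimit-Remaining` header that the GitHub REST API returns.

```python
class AccountPool:
    """Hand out backup accounts in order, moving on when one runs out of requests."""

    def __init__(self, quotas):
        # quotas: mapping of account name to remaining API requests (assumed values)
        self._quotas = dict(quotas)
        self._order = list(quotas)
        self._index = 0

    def acquire(self):
        """Return the account to use for the next request and consume one request."""
        while self._index < len(self._order):
            name = self._order[self._index]
            if self._quotas[name] > 0:
                self._quotas[name] -= 1
                return name
            # Current account is exhausted: attach the next one.
            self._index += 1
        raise RuntimeError("all backup accounts have exhausted their request quota")


# Usage: the first account covers two requests, then the second takes over.
pool = AccountPool({"backup-1": 2, "backup-2": 1})
used = [pool.acquire() for _ in range(3)]
```

A production version would also need to re-enable an account once its quota resets, which this sketch leaves out.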