-
Notifications
You must be signed in to change notification settings - Fork 483
chore(backup): add s3 backup script #818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8ff6953
to
01448f7
Compare
Successfully tested on dev-fossology |
b976e33
to
7245350
Compare
a4fbdd8
to
46143d5
Compare
c8e9cd2
to
275bdbe
Compare
utils/backup/S3/fo-backup-s3
Outdated
fi | ||
|
||
if [[ ${1} = "-l" ]]; then | ||
LOGGING=${2} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to call this variable LOG
or LOGFILE
Please do not create wiki pages before the corresponding features are merged. |
2cfa4c3
to
7e0d014
Compare
8edf22a
to
cc4e2be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some comments
|
||
echo Start database dump at `date` | ||
#### postgresql needs to be restarted for the dump to work | ||
/etc/init.d/postgresql restart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just /etc/init.d/postgresql start
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because it needs to be restarted for the dump to work. I don't think start would work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not know why this should be necessary (except for reloading the config, which you do not change)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe this forces the DB to write everything out...
utils/backup/S3/fo-backup-s3
Outdated
|
||
#### sync repository | ||
echo "Uploading and starting backup to S3..." | ||
tar -zcvf - /srv/fossology/ | ${DIR}/venv/bin/aws --debug s3 cp --expected-size 200000000000 --sse AES256 - "s3://${BUCKET}/${FILENAME}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--sse AES256
is the default value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but maybe the default value will change overtime and then it wouldn't work anymore. What do you think?
utils/backup/S3/fo-backup-s3
Outdated
|
||
|
||
echo Creating diff of repository at `date` | ||
ls -RlahGg --time-style='+' /srv/fossology/repository/ | cut -d ' ' -f 2-5 | ${DIR}/venv/bin/aws s3 cp --sse AES256 - "s3://${BUCKET}/${FILENAME}-repo-diff.txt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this recursive? This cut
splits folders with spaces in their names in arbitrary places.
Why do you keep the inode-count?
Please add variables for magic strings like "s3://${BUCKET}/${FILENAME}-repo-diff.txt"
which are used in multiple places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
splits folders with spaces in their names in arbitrary places
No it just splits the column. There is a tab in there, not spaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are listing recursively, the ls -RlahGg --time-style='+'
also lists the names of the corresponding folders as the first "word" in a line (for sub directories). If they contain spaces you are splitting them and keeping a part of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you happen to know a way to remove the inode-count, besides cut?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
example:
$ tree ./ /tmp/test
./
├── Folder with splaces
│ └── file
└── otherFile
and
$ ls -RlahGg --time-style='+' . | cut -d ' ' -f 2-5 /tmp/test
.:
20K
3 4,0K
16 12K
2 4,0K
1
with splaces:
8,0K
2 4,0K .
3 4,0K ..
1
Note:
here you also see that the ls -RlahGg --time-style='+'
contains on my system spaces and not tabs => the filenames are not listed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right. Also we could also just use tree for the diff, right? Only filesize would be gone then. Or what do you think should be the best way to approach this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want a list of all files together with their sizes use something like
find /srv/fossology/repository/ -type f -exec du -h {} \;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
utils/backup/S3/fo-restore-s3
Outdated
|
||
echo Checking diff of repository at `date` | ||
DIFF=`${DIR}/venv/bin/aws s3 cp --sse AES256 s3://${BUCKET}/${FILENAME}-repo-diff.txt -` | ||
CURRENT_DIFF=`ls -RlahGg --time-style='+' /srv/fossology/repository/ | cut -d ' ' -f 2-5` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are calling ls -RlahGg --time-style='+' /srv/fossology/repository/
at multiple places, please move them to an function.
|
||
echo Start removing old data at `date` | ||
|
||
${DIR}/../../fo-cleanold --delete-repository --delete-database |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might need to ensure that the database is running.
This might delete productive data, you should display a warning and give the option to exit.
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. | ||
|
||
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" | ||
source ${DIR}/before |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are sourcing the before
before setting set -x
. You can't be sure that $FILENAME
is filled and that apache is stopped
utils/backup/S3/fo-restore-s3
Outdated
su postgres -c "psql -f ${TMP_FILE}" | ||
echo --------- End database log --------- | ||
else | ||
su postgres -c "psql -f ${TMP_FILE}" > /dev/null 2>&1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you also suppress STDERR
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because there are many errors that come up, because tables already exist etc...
utils/backup/S3/fo-backup-s3
Outdated
#### postgresql needs to be restarted for the dump to work | ||
/etc/init.d/postgresql restart | ||
#### these steps are split, because of permissions | ||
TMP_FILE=`mktemp` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already done in before
034a6ff
to
ff3bb1a
Compare
0e75722
to
20f476f
Compare
fc88e17
to
e08f5bd
Compare
utils/backup/S3/fo-restore-s3
Outdated
echo "Please specify a file, where to save the log file to" | ||
exit 1 | ||
fi | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation is confusing here
utils/backup/S3/fo-restore-s3
Outdated
exit 1 | ||
fi | ||
else | ||
exec &> >(tee -a ${LOGGING}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you only call this command if no argument "-l"
was passed. That looks wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same in the other file
@max-wittig how about writing the status of this pull request as of now? |
The Pull Request is working, but it's not possible to run the restore and backup script at the same time. |
e08f5bd
to
cc7e6d9
Compare
cc7e6d9
to
4716837
Compare
Wiki page: S3-Backup