rsync --delete with superset destination folder

I have an rsync process which is syncing content from a source repository (which is version controlled) into a shared NFS mount.
The scenario (as awful as it is) is that the destination folder contains more content than the source folder because other content is synced to the destination folder from different sources. So for instance, folder structures may look like this:
source
    a/a1.txt
    a/a2.txt
    b/b1.txt
destination
    a/a1.txt
    a/a2.txt
    a/a3.txt
    b/b1.txt
    c/c1.txt
(In this example, a/a3.txt and c/c1.txt are synced to the destination from elsewhere. In practice this involves multiple other sources, and the content and processes for these can't be influenced.)
Now say that a/a2.txt is deleted from the source folder. With the existing setup, this file would not be deleted on the destination; but using --delete would also delete the files that came from elsewhere (a/a3.txt and c/c1.txt), and it is a requirement not to do this.
How could --delete be used on this rsync but meet the requirement? Because the source directory is version controlled, it is simple enough to get a before-and-after of this directory, so a differential backup could be calculated using the original source directory as a reference, but is this the best way?
rsync
edited May 30 at 10:43
asked May 30 at 10:38
cmbuckley
1 Answer
You cannot use rsync --delete like this. It's stateless and keeps no record of which files have been deleted between runs. The --delete flag simply instructs rsync to delete every file on the destination that does not exist on the source.
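To make the difference concrete, here is a small sketch that uses the file lists from the question instead of real directories. It only models the set arithmetic with comm and does not invoke rsync; the temporary file names (src.before, src.after, dst) are placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch only: models which paths would be removed; it does not run rsync.
# The lists are the example trees from the question, already in C-locale order.
tmp=$(mktemp -d)
printf '%s\n' a/a1.txt a/a2.txt b/b1.txt > "$tmp/src.before"
printf '%s\n' a/a1.txt b/b1.txt > "$tmp/src.after"
printf '%s\n' a/a1.txt a/a2.txt a/a3.txt b/b1.txt c/c1.txt > "$tmp/dst"

# What --delete removes: everything in dst that is absent from the current source
delete_would_remove=$(LC_ALL=C comm -13 "$tmp/src.after" "$tmp/dst")

# What the requirement asks for: only files that used to be in the source
should_remove=$(LC_ALL=C comm -23 "$tmp/src.before" "$tmp/src.after")

printf '%s\n' "--delete would remove:" "$delete_would_remove"
printf '%s\n' "should be removed:" "$should_remove"
rm -rf "$tmp"
```

The second set can only be computed from the previous state of the source, which is exactly the information rsync does not have.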
In order to implement this constrained deletion I think you need to maintain your own state. Neither rsync nor unison can do this for you.
The following is not a full error-safe solution; it's a starting point. (However, it does handle files with strange names - including those containing an embedded newline.)
Assume two directories src and dst. (For the purposes of the example it doesn't really matter whether dst is local or remote.)
    # Find the current list of files (do this just once, to prep the cache)
    ( cd src && find . -type f -print0 ) | LC_ALL=C sort -z > .state.src
Each time we perform a backup, run the following code:
    # Run the rsync to transfer files. "dst/" could be local
    rsync -av src/ remote:dst/
    # Determine the set of files to delete in "dst/":
    # those listed in the old state (.state.src) but missing from the new list on stdin
    ( cd src && find . -type f -print0 ) | LC_ALL=C sort -z | tee .state.src.new |
        LC_ALL=C comm -z -13 - .state.src |
        ssh remote 'while IFS= read -d "" -r f; do rm -f "dst/$f"; done'
    # That seemed to work, so update the state cache
    [[ 0 -eq $? ]] && mv -f .state.src.new .state.src
If your version of comm (like mine) is older than GNU coreutils 8.25 and does not have the -z flag, you can use this alternative workaround, which swaps NUL and newline characters so that comm can work on newline-terminated records:
    # Find the current list of files (do this just once, to prep the cache)
    ( cd src && find . -type f -print0 ) | tr '\0\n' '\n\0' | LC_ALL=C sort > .state.src
Each time we perform a backup, run the following code:
    # Run the rsync to transfer files. "dst/" could be local
    rsync -av src/ remote:dst/
    # Determine the set of files to delete in "dst/"
    ( cd src && find . -type f -print0 ) | tr '\0\n' '\n\0' | LC_ALL=C sort | tee .state.src.new |
        LC_ALL=C comm -13 - .state.src |
        tr '\0\n' '\n\0' |
        ssh remote 'while IFS= read -d "" -r f; do rm -f "dst/$f"; done'
    # That seemed to work, so update the state cache
    [[ 0 -eq $? ]] && mv -f .state.src.new .state.src
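For a local destination, the same idea can be wrapped in two small functions. This is a minimal sketch, not a hardened implementation: the names snapshot and prune_removed are hypothetical, and it assumes GNU find, sort, xargs, and a comm that supports -z (coreutils 8.25 or later).

```shell
#!/bin/sh
# Hypothetical helper names; assumes GNU find, sort, xargs and comm >= 8.25.

# snapshot DIR
# Print a NUL-delimited, byte-sorted list of the regular files under DIR.
snapshot() {
    ( cd "$1" && find . -type f -print0 ) | LC_ALL=C sort -z
}

# prune_removed OLD_STATE NEW_STATE DST
# Delete from DST only the files listed in OLD_STATE but not in NEW_STATE.
# Files that exist only in DST never appear in either list, so they are kept.
prune_removed() {
    LC_ALL=C comm -z -13 "$2" "$1" |
        ( cd "$3" && xargs -0 rm -f -- )
}
```

One possible run would be `snapshot src > .state.src` once, then after each rsync: `snapshot src > .state.src.new && prune_removed .state.src .state.src.new dst && mv -f .state.src.new .state.src`. Using xargs -0 rather than a read loop keeps the functions free of bash-only syntax.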
I'm aware of this, but if I have my source directory under version control, then I have the state I need; it's just a question of whether --compare-dest could be used for this at all.
– cmbuckley
May 30 at 11:17
@cmbuckley try the new code I've added to the answer
– roaima
May 30 at 15:54
Reference: Using comm with NULL-terminated records
– roaima
May 30 at 16:05
@cmbuckley --compare-dest does not affect --delete, so that won't do what you want either. You will need the "before" state of the source, no way around that. If your version control tool can generate a commit diff for the entire directory, just make one of those and apply the diff to the destination.
– jw013
May 30 at 20:08
@jw013 that sounds like a sufficiently different answer that you could reasonably make it one
– roaima
May 30 at 21:39
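The version-control suggestion in the comments could look roughly like the sketch below. It assumes the source is a git work tree whose root maps onto the root of the destination; the thread does not say which version control tool is in use, so git is an assumption, and apply_vcs_deletions is a hypothetical name.

```shell
#!/bin/sh
# Assumption: the source is a git repository (the thread only says "version controlled").

# apply_vcs_deletions SRC OLD NEW DST
# Remove from DST every path that git reports as deleted between commits OLD and NEW.
# --no-renames makes a renamed file show up as a deletion of its old path.
apply_vcs_deletions() {
    git -C "$1" diff --no-renames --diff-filter=D --name-only -z "$2" "$3" |
        ( cd "$4" && xargs -0 rm -f -- )
}
```

For example, `apply_vcs_deletions src HEAD~1 HEAD dst` would remove only the files deleted by the latest commit, leaving anything that other processes placed in dst untouched. The ordinary rsync run without --delete still handles additions and modifications.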
edited May 30 at 18:17
answered May 30 at 10:47
roaima