LVM Snapshot without copy-on-write

I've been experimenting with LVM and how I can use it for managing data on my NFS server. Despite everything I've read about snapshots, I am still unsure how they behave in real life. Why does one need to allocate space for the snapshot if it is just a bunch of pointers to the original data? What is the point of the snapshot if modifying files on the origin also triggers modification of the snapshot via copy-on-write? I thought that the snapshot is supposed to be a "static" point in time of the original.

My expectation is such:

$ ls origin-data
> file1 file2
$ snapshot origin-data to origin-data-snapshot
$ modify origin-data and add new stuff
$ ls origin-data
> file1-modified file2 file3 file4
$ ls origin-data-snapshot
> file1 file2
$ sizeof origin-data-snapshot
> 0 bytes because they're all just pointers to blocks in origin-data!

If I'm misunderstanding, please explain and also explain how snapshots could be used in the way I'm expecting (like git commits, static, non-changing, pointers to data in a point in time that don't care about changes made to the origin). Does it involve RO or RW snapshots?

UPDATE: I've been experimenting with some test partitions and have a bit more understanding. With both origin and its snapshot mounted, new files in origin obviously show up in something like df -h but not in the snapshot. Meanwhile, lvdisplay shows the "Allocated to snapshot" percentage increasing. Using 10MB test files and 1GB test partitions, I see exactly how this percentage behaves in relation to my data, but why must this be so? Why does the new data show up on the snapshot and not origin? I would think the blocks behave like hard links, in that old data stays there because the snapshot points to it, while new blocks are created next to them because origin points to the new and modified blocks. No?

edited Mar 11 '14 at 6:17

asked Mar 11 '14 at 5:36

brianclements

backup lvm snapshot

2 Answers

accepted

The cost of a snapshot cannot possibly be zero bytes. When a block is changed in the source volume, and you have a snapshot, a copy of the original block prior to modification must be made: the original data must be available somewhere so that it's accessible from the snapshot.

That's what the snapshot size is (plus some metadata): original copies of blocks that have since been changed in the source.

Note that it might be an "accounting trick": an implementation could choose not to overwrite the original block on disk, but rather store the new data somewhere else and update the source block list (or whatever it uses to track blocks). In this case the snapshot is "static" as per your definition. But it still causes the overall number of allocated blocks to grow whenever a source block is modified. This space usage should be (and is) accounted against the snapshot.

This is true both for RO and RW snapshots, except that it's a bit more complex in the RW case (you don't want to overwrite a block that was modified in the snapshot by an original block from the source if that is modified too, for example).
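To make the mechanism concrete, here is a minimal sketch of a read-only copy-on-write snapshot. It is a toy model, not LVM's actual code: the `Volume` class, its method names, and the four-block volume are all made up for illustration.

```python
class Volume:
    """Toy origin volume with one classic copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data of the origin
        self.saved = None            # snapshot store: block index -> old content

    def take_snapshot(self):
        # A fresh snapshot stores nothing; it only shares the origin's blocks.
        self.saved = {}

    def write(self, index, data):
        # Before the first overwrite of a block, preserve its old content.
        if self.saved is not None and index not in self.saved:
            self.saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        # Preserved copy if the block changed since the snapshot was taken,
        # otherwise the block that is still shared with the origin.
        return self.saved.get(index, self.blocks[index])

    def snapshot_usage(self):
        return len(self.saved)


vol = Volume(["a0", "b0", "c0", "d0"])
vol.take_snapshot()
usage_before = vol.snapshot_usage()

vol.write(1, "b1")
vol.write(1, "b2")   # a second write to the same block costs nothing extra

snapshot_view = [vol.read_snapshot(i) for i in range(4)]
print(usage_before, vol.snapshot_usage(), snapshot_view, vol.blocks)
```

In this model the snapshot starts out using no space, and its usage grows by one block the first time each origin block is overwritten, while the snapshot's view never changes.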

answered Mar 11 '14 at 6:02

Mat

This pretty much answers it. That makes sense about old data that is overwritten on origin being pulled to the snapshot to maintain its moment in time. But what then about creating new data on origin? I still see the "Allocated to snapshot" percentage increase in lvdisplay on the snapshot when I continue to create brand new files on origin. Why doesn't that just count against origin?
– brianclements
Mar 11 '14 at 6:23

The snapshot doesn't work at the filesystem level, but at the block level. LVM doesn't know/understand the filesystem that sits on top of it, so it has to copy any block that is modified in the source to preserve it. That includes the blocks that were modified just for metadata (wherever the FS stored the fact that there is a new file), and all the newly touched data blocks in the source. A filesystem-level snapshot would (most likely) have different characteristics in this scenario.
– Mat
Mar 11 '14 at 6:38
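The point in the comment above can be sketched with a toy "filesystem" on top of a block device. Everything here is hypothetical (the 8-block volume, block 0 acting as the directory, the `create_file` helper); it only illustrates that the block layer sees writes, not files.

```python
BLOCKS = 8
origin = [b""] * BLOCKS          # hypothetical 8-block volume
origin[0] = b"dir: file1"        # block 0 holds the toy directory
origin[1] = b"file1 data"        # block 1 holds file1; blocks 2-7 are unused

preserved = {}                   # snapshot store, empty right after creation


def block_write(index, data):
    """Block-level write with copy-on-write; it never sees file names."""
    if index not in preserved:
        preserved[index] = origin[index]   # save old content, even if "empty"
    origin[index] = data


def create_file(name, data, free_block):
    """Toy filesystem operation: one data write plus one metadata write."""
    block_write(free_block, data)
    block_write(0, origin[0] + b", " + name)


create_file(b"file3", b"brand new data", free_block=2)

print(sorted(preserved))       # blocks the snapshot had to preserve
print(preserved[2] == b"")     # the saved copy of block 2 is the old, unused block
```

Creating one new file costs the snapshot two blocks in this model: the directory block that changed, and the previously unused data block, whose old contents the block layer cannot know are irrelevant.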

Ah OK. So LVM simply cannot tell the difference between a file modification and a new file; it just treats them all the same and puts new/changed blocks in the snapshot.
– brianclements
Mar 11 '14 at 7:00

I just looked into this topic. As with the OP, my core point of confusion stemmed from "thinking in files" while LVM works with physical extents.

Usually, LVM sits between the HDD and a file system. Each of these three layers has its own term for the concept of "equally sized chunks of bytes":

hdd: sectors (512 bytes) -> LVM: physical extents (4MB) -> file system: blocks (e.g. 4K)

I created a 200MB loop device and used 100MB of it for a logical volume (testlv) and 60MB for a snapshot LV (snaplv).

The 100MB LV can be thought of as consisting of 25 physical extents, each holding 4MB worth of file system blocks. The snapshot LV initially references these same PEs and does not yet use any of its own 15 PEs. Whenever the user writes to either logical volume's file system, the file system changes the contents of one or more blocks, which are themselves stored in LVM physical extents. (LVM actually tracks copy-on-write in smaller snapshot chunks, set with lvcreate's --chunksize option, rather than in whole extents, but the extent-level picture is enough to follow the mechanism.)
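The extent counts above are plain division, spelled out here as a small sketch. The helper name `extents_needed` is made up, and the 4MB extent size is the one quoted in this answer.

```python
EXTENT_MB = 4   # extent size quoted in this answer


def extents_needed(size_mb, extent_mb=EXTENT_MB):
    """Number of whole extents needed to hold size_mb (rounded up)."""
    return -(-size_mb // extent_mb)


testlv_extents = extents_needed(100)
snaplv_extents = extents_needed(60)

# Share of the origin that can change before the snapshot's own space runs out,
# under the simplified extent-level model used in this answer.
changeable_share = snaplv_extents / testlv_extents

print(testlv_extents, snaplv_extents, changeable_share)
```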

Modifying a PE from testlv therefore means:

1. copy the contents of the PE to one of the spare snaplv PEs (copy-on-write)
2. change snaplv's reference to this "new" PE
3. update the contents of the "original" testlv PE

Changing a PE from snaplv is almost the same. Only the final step differs: it is snaplv's copy of the PE that gets updated, and the original testlv PE is left alone.
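The steps above, including the write-to-snapshot variant, can be sketched roughly as follows. This is a toy model of the extent-level picture in this answer, not LVM's real data structures; the `SnapshotPair` class and all of its names are made up.

```python
class SnapshotPair:
    """Toy extent-level model of an origin LV and a writable snapshot LV."""

    def __init__(self, origin_extents, spare_count):
        self.origin = list(origin_extents)   # testlv's extents
        self.spare = [None] * spare_count    # snaplv's own, initially unused extents
        self.ref = {}                        # origin extent index -> spare extent index
        self.next_free = 0

    def _copy_out(self, i):
        # Steps 1 and 2: copy the extent to a spare one and repoint the snapshot.
        if i not in self.ref:
            if self.next_free == len(self.spare):
                raise RuntimeError("snapshot is full")
            self.spare[self.next_free] = self.origin[i]
            self.ref[i] = self.next_free
            self.next_free += 1

    def write_origin(self, i, data):
        self._copy_out(i)
        self.origin[i] = data                # step 3: update the original extent

    def write_snapshot(self, i, data):
        self._copy_out(i)
        self.spare[self.ref[i]] = data       # step 3 variant: update the snapshot's copy

    def read_snapshot(self, i):
        return self.spare[self.ref[i]] if i in self.ref else self.origin[i]


pair = SnapshotPair(["e0", "e1", "e2"], spare_count=2)
pair.write_origin(0, "e0-new")       # origin changes, snapshot keeps "e0"
pair.write_snapshot(1, "e1-snap")    # snapshot changes, origin keeps "e1"

snap_view = [pair.read_snapshot(i) for i in range(3)]
print(pair.origin, snap_view, pair.next_free)
```

Either kind of write uses up one of the snapshot's spare extents the first time an extent is touched, which is why the model refuses further first-time writes once the spares are exhausted.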

answered 27 mins ago

T Nierath

New contributor
