@clseibold PRISM VCS sounds like an interesting project, and I'll be keeping an eye on it. I just had a thought, though: could the ZFS filesystem and its tooling be used to knock up a rudimentary VCS, where each project is a distinct ZFS volume and each commit is essentially a snapshot? Mounting two snapshots could be used to generate diffs, and branches could perhaps be handled through some kind of metadata file or SQLite database that associates branches with snapshots. ZFS would bring dedupe and compression, and a tagged release would simply be a snapshot ID. What are your (or anyone else's) thoughts?
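To make the idea a bit more concrete, here is a minimal sketch in Python. It assumes each project lives in its own ZFS dataset and only builds the `zfs snapshot` and `zfs diff` command lines rather than running them. The helper names, the `BranchTable` class, and the `branches` table layout are all hypothetical, invented for illustration.

```python
import sqlite3


def snapshot_cmd(dataset, commit_id):
    # A "commit" is a snapshot named <dataset>@<commit_id>.
    return ["zfs", "snapshot", f"{dataset}@{commit_id}"]


def diff_cmd(dataset, old_commit, new_commit):
    # `zfs diff` lists the paths that changed between two snapshots.
    return ["zfs", "diff", f"{dataset}@{old_commit}", f"{dataset}@{new_commit}"]


class BranchTable:
    """Hypothetical branch metadata: maps a branch name to a snapshot ID."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS branches ("
            "name TEXT PRIMARY KEY, head TEXT NOT NULL)"
        )

    def set_head(self, branch, commit_id):
        # Move (or create) the branch pointer.
        self.db.execute(
            "INSERT OR REPLACE INTO branches (name, head) VALUES (?, ?)",
            (branch, commit_id),
        )
        self.db.commit()

    def head(self, branch):
        row = self.db.execute(
            "SELECT head FROM branches WHERE name = ?", (branch,)
        ).fetchone()
        return row[0] if row else None


branches = BranchTable()
branches.set_head("main", "c1")
print(snapshot_cmd("tank/myproject", "c1"))
print(diff_cmd("tank/myproject", "c1", "c2"))
```

The commands could be passed to `subprocess.run` on a machine with ZFS. Since snapshots are read-only, working on a branch that starts from an older snapshot would probably need something like `zfs clone`, which this sketch does not attempt.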
2025-02-23 · 1 year ago · clseibold
6 Comments
clseibold · 2025-02-23 at 19:23:
That's an interesting idea. I'm not sure it will work with everything that I have planned. I intend to create additional object types outside of blobs and trees.
The object store is actually the simplest part of a git-like VCS. The object store in Prism is already nearly complete, and it took like two days of work. So I'm not sure it's worth it at this point to completely switch it to something like this.
The object stores in git and other similar VCSs do have dedup, at least at the file level: the same file in multiple snapshots is stored only once, keyed by its hash. Git also has an additional level of compression with zlib pack files, which rely on what is basically deduping byte strings.
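File-level dedup by hash can be sketched in a few lines of Python. This is a simplified in-memory illustration, not git's or Prism's actual object format: the `ObjectStore` class is hypothetical, and it uses SHA-256 over the raw bytes with zlib compression per object.

```python
import hashlib
import zlib


class ObjectStore:
    """Hypothetical content-addressed store: identical content is kept once."""

    def __init__(self):
        self.objects = {}  # hex digest -> zlib-compressed bytes

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        if key not in self.objects:  # already stored: nothing to write
            self.objects[key] = zlib.compress(data)
        return key

    def get(self, key):
        return zlib.decompress(self.objects[key])


store = ObjectStore()
first = store.put(b"hello world\n")   # file as seen in snapshot 1
second = store.put(b"hello world\n")  # same file, unchanged, in snapshot 2
print(first == second, len(store.objects))
```

Because the key is derived from the content, an unchanged file in a later snapshot maps to the same key and costs no extra storage.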
And going even further, git also does delta compression in pack files. Mercurial also uses delta compression, but it does a trick where it intersperses the full files every so often so that it doesn't have to go from the very beginning of the history to reconstruct a file at a particular snapshot using the deltas.
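The trick of interspersing full copies among deltas can be sketched roughly as follows. The `RevisionLog` class and the fixed `full_every` interval are assumptions chosen for simplicity, not a description of Mercurial's actual heuristic, and the delta format (copy/insert operations built with `difflib`) is likewise only illustrative.

```python
import difflib


def make_delta(old, new):
    """Encode `new` as copy/insert operations against `old`."""
    ops = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))  # literal new text
        # "delete" emits nothing: the old text is simply not copied
    return ops


def apply_delta(old, ops):
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


class RevisionLog:
    """Hypothetical revision log storing a full text every `full_every` revisions."""

    def __init__(self, full_every=4):
        self.full_every = full_every
        self.entries = []  # ("full", text) or ("delta", ops)

    def append(self, text):
        rev = len(self.entries)
        if rev % self.full_every == 0:
            self.entries.append(("full", text))
        else:
            previous = self.read(rev - 1)
            self.entries.append(("delta", make_delta(previous, text)))
        return rev

    def read(self, rev):
        # Walk back to the nearest full text, then replay deltas forward.
        base = rev
        while self.entries[base][0] != "full":
            base -= 1
        text = self.entries[base][1]
        for i in range(base + 1, rev + 1):
            text = apply_delta(text, self.entries[i][1])
        return text
```

With this layout, reading any revision replays at most `full_every - 1` deltas instead of the whole history.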
I don't really like the idea of the object store being dependent on one specific filesystem.
stack · 2025-02-23 at 20:50:
My recent experience with ZFS ended with a total loss of data. I also lost a lot of data relying on a RAID a long while back. I am sure that is entirely my own fault (somehow), but I've never been able to improve on copying a drive to another drive every so often.
drh3xx [OP] · 2025-02-23 at 23:26:
@clseibold sorry for the confusion. I was just wondering whether you thought such a system might be workable in theory, not suggesting that you use ZFS in your own project. I do agree to an extent that depending on a particular FS is a negative. It was just a thought that popped into my head.
drh3xx [OP] · 2025-02-23 at 23:30:
@stack I'm surprised to hear that, given ZFS's history and reputation. I suppose that, depending on the host OS, it could have been an implementation bug, or, if you didn't use ECC RAM, perhaps some memory corruption could have caused your loss.
clseibold · 2025-02-23 at 23:40:
@drh3xx I don't really know enough about ZFS to say whether it would work for a regular VCS. But you could try it and see if it works well.
stack · 2025-02-24 at 01:19:
Like I said, it's likely my own stupidity. I can't remember the details, but I think I tried to look at something in a snapshot, and by the time I was done I had lost a bunch of data.
That largely coincided with the end of my attempt to use FreeBSD as I got tired of working hard to get simple things to work. My needs are generally minimal, and I don't want to 'learn to be a sysadmin'...