Just wanted to suggest an option to add xxHash as an alternative FileCheck checksum algorithm.
xxHash is quickly becoming the de facto standard for checksumming media files, at least in the video post-production industry. It's significantly faster than MD5, which would greatly improve the usability of the FileCheck indexing option in NeoFinder.
I actually rarely use FileCheck—even though I want to—because of the massive overhead in time that's involved with generating the MD5 checksums. I usually end up using an external xxHash checksum generation tool to create standalone checksum logs for data sets. But it would be much more convenient if I could just generate the xxHash checksums at the same time that I'm indexing the data in NeoFinder.
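To illustrate what I mean by generating the checksum during indexing, here is a minimal Python sketch (not NeoFinder's code; `file_checksum` and its parameters are names I made up). It streams a file once in fixed-size chunks through whichever hasher is plugged in, so switching algorithms only means switching the factory:

```python
import hashlib

def file_checksum(path, hasher_factory=hashlib.md5, chunk_size=1024 * 1024):
    """Stream a file through a hasher in fixed-size chunks; return the hex digest."""
    hasher = hasher_factory()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Assuming the third-party `xxhash` package is installed, `file_checksum(path, xxhash.xxh64)` should work the same way, since its hasher objects also provide `update()` and `hexdigest()`.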
This whitepaper by the Digital Preservation Coalition mentions xxHash as a worthy checksum algorithm for large audio/visual media.
The GitHub page for the xxHash project can be found here
Adding xxHash as a FileCheck algorithm?
-
- Site Admin
- Posts: 298
- Joined: Tue Mar 08, 2022 3:10 pm
Re: Adding xxHash as a FileCheck algorithm?
Hello Mel,
That is a very interesting question, and we will definitely look into the xxHash checksum. It is good to know that post-production studios are now using it.
Which variant of xxHash is being used there? XXH3 with 64 or 128 bits?
I do doubt, though, that it will be much faster than the standard MD5 that NeoFinder currently uses for the FileCheck values. The current MD5 implementation in NeoFinder is already 45% faster than the standard macOS command line tool, but that speed gain was only possible through more efficient I/O queuing. In our performance studies, the bottleneck was always the network speed of the server and the somewhat lackluster performance of the macOS SMB client for files on a server.
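For readers wondering what I/O queuing can look like in general, here is a rough Python sketch. It is not how NeoFinder is implemented, and `pipelined_md5`, `chunk_size` and `depth` are hypothetical names. A background thread keeps reading ahead into a small bounded queue while the main thread hashes, so reading and hashing can overlap:

```python
import hashlib
import queue
import threading

def pipelined_md5(path, chunk_size=4 * 1024 * 1024, depth=4):
    """Hash a file while a background thread keeps reading ahead."""
    chunks = queue.Queue(maxsize=depth)  # bounded, so memory use stays capped
    errors = []

    def reader():
        try:
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    chunks.put(chunk)  # blocks while the queue is full
        except OSError as exc:
            errors.append(exc)
        finally:
            chunks.put(None)  # sentinel: no more data will arrive

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    hasher = hashlib.md5()
    while True:
        chunk = chunks.get()
        if chunk is None:
            break
        hasher.update(chunk)
    thread.join()
    if errors:
        raise errors[0]
    return hasher.hexdigest()
```

Whether this helps depends on where the bottleneck is. If the server or the SMB client delivers data more slowly than the hash can consume it, the queue simply stays empty and the hash choice matters little.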
-
- Posts: 5
- Joined: Tue Feb 28, 2023 5:28 pm
Re: Adding xxHash as a FileCheck algorithm?
In media production, xxHash64be is generally the de facto standard. It's supported by nearly all of the major file offloading tools for post-production, such as Silverstack, ShotPutPro, Hedge and YoYotta.
The speed benefits of xxHash are much more pronounced the faster the I/O is, because the hashing algorithm runs nearly as fast as the speed of your system RAM, and isn't CPU bound like MD5. Lots of video professionals are running SSD RAIDs, so MD5 can be a (literal) drag when checksumming large files.
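If anyone wants to check the raw hashing speed on their own machine, here is a small Python sketch (the function name and defaults are my own). It hashes a buffer that is already in memory, so disk and network speed are left out of the measurement. The xxHash part assumes the third-party `xxhash` package is installed and is skipped otherwise:

```python
import hashlib
import time

def hash_throughput_mb_s(hasher_factory, size_mb=64, chunk_mb=1):
    """Return MB/s for hashing an in-memory buffer (no disk or network I/O)."""
    chunk = bytes(chunk_mb * 1024 * 1024)  # zero-filled buffer, reused each round
    hasher = hasher_factory()
    start = time.perf_counter()
    for _ in range(size_mb // chunk_mb):
        hasher.update(chunk)
    hasher.digest()
    elapsed = time.perf_counter() - start
    return size_mb / elapsed

if __name__ == "__main__":
    candidates = {"md5": hashlib.md5}
    try:
        import xxhash  # third-party; assumed installed via `pip install xxhash`
        candidates["xxh64"] = xxhash.xxh64
    except ImportError:
        pass
    for name, factory in candidates.items():
        print(f"{name}: {hash_throughput_mb_s(factory):.0f} MB/s")
```

Comparing these numbers against the read speed of the storage shows which side is the limit: on a fast SSD RAID the hash may be the slower part, while on a network share it usually is not.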
-
- Site Admin
- Posts: 298
- Joined: Tue Mar 08, 2022 3:10 pm
Re: Adding xxHash as a FileCheck algorithm?
Awesome, thanks, we will check it out!