Gocryptfs Use of Scrypt Introduces a Vulnerability That Renders the Entire Cryptographic Scheme Effectively Compromised

gocryptfs Nov 10, 2020
Preface: To anyone reading, this is a continuation of a former (now closed) issue that I opened a few days ago, asking about the inclusion of Argon2 / outright replacement of Scrypt. For some reason (probably general fatigue), my side of that interaction was more informal than I would have preferred. That is not the case here. Additionally, this 'issue' was not opened for the sake of slamming Scrypt as a KDF, but rather to convey to the developer of this program, specifically, that their insistence on keeping Scrypt does not just come at the cost of forgoing a 'better solution'; it also comes at the cost of protecting users. Replacing it would provide a better solution, one without any known exploits / compromises.


Scrypt is Exploitable in a Practical Way Via Cache Timing Attacks

Just to give some background (I'm sure you already know this, but for the sake of any follow-up readers):

"MHMix [function used by Scrypt for memory-hardness] takes the input block, hashes it many times while saving the hash results, and computes an output derived from some of the hash results that are chosen by interpreting certain hash values as indices. Since the hash values will be unique to each password input to scrypt, the indices of the hash values that are accessed and used in the output will also be unique to each password. Since the hash values that are needed to compute the true scrypt output are password-dependent, an adversary conducting a brute force dictionary attack against scrypt will be unable to predict which values will contribute to the final output and will be forced to store all values to potentially be used, making array A in MHMix use large amounts of memory."
The passage above is quoted from a recent paper published by Stanford titled 'Attacking Scrypt via Cache Timing Side-Channel' (link = https://crypto.stanford.edu/cs359c/17sp/projects/MarkAnderson.pdf)
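To make the quoted description concrete, here is a minimal sketch of the fill-then-read structure it describes. It is a toy model, not scrypt itself: SHA-256 stands in for scrypt's real mixing function, and the name `toy_romix` and the parameter choices are assumptions made for illustration only.

```python
import hashlib


def toy_romix(password: bytes, salt: bytes, n: int) -> tuple[bytes, list[int]]:
    """Toy memory-hard mix; returns the output and the table indices it read."""
    def h(data: bytes) -> bytes:
        # SHA-256 is a stand-in for scrypt's BlockMix (assumption for brevity).
        return hashlib.sha256(data).digest()

    # Seed derived from the password, as scrypt does with a 1-iteration PBKDF2.
    x = hashlib.pbkdf2_hmac("sha256", password, salt, 1, 32)

    table = []
    for _ in range(n):  # phase 1: hash repeatedly, saving every result
        table.append(x)
        x = h(x)

    accessed = []
    for _ in range(n):  # phase 2: read back at indices chosen by hash values
        j = int.from_bytes(x[:8], "little") % n
        accessed.append(j)
        x = h(bytes(a ^ b for a, b in zip(x, table[j])))
    return x, accessed
```

Calling `toy_romix(b"password-a", b"salt", 16)` and `toy_romix(b"password-b", b"salt", 16)` yields two different index lists, which is the password-dependent access pattern the rest of this write-up is concerned with.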

The paper goes on to outline the 'PRIME+PROBE' method, which the paper states is the "best technique to use against scrypt".

I won't waste too much time restating what is already contained in the paper, but in essence, an attacker is able to manipulate what is held in the shared cache in a way that allows them to subsequently observe which data the victim accessed.

Specifically:

"First, the attacker flushes the victim's data from cache (the PRIME stage) by accessing data that it knows will evict the victim's data from cache. Then, after waiting to give the victim an opportunity to access its data, the attacker attempts to access the data it put in cache (the PROBE stage). If the time to retrieve the data is relatively short, that means that the victim did not access its data. If the time to retrieve the data is relatively long, the victim accessed the target data, which caused the attacker's data to be evicted from cache, forcing the attacker to wait for its data to be retrieved from a lower-level memory."
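The PRIME and PROBE stages can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the cache is modelled as direct-mapped (one line per set), and a "slow" access is represented by a miss flag rather than by a real timing measurement. `ToyCache` and `prime_and_probe` are hypothetical names, not part of any real tool.

```python
class ToyCache:
    """Direct-mapped cache model: each set holds exactly one (owner, address) line."""

    def __init__(self, num_sets: int):
        self.sets = [None] * num_sets

    def access(self, owner: str, address: int) -> bool:
        """Touch an address; return True on a hit (fast), False on a miss (slow)."""
        idx = address % len(self.sets)
        hit = self.sets[idx] == (owner, address)
        self.sets[idx] = (owner, address)  # the accessed line now occupies the set
        return hit


def prime_and_probe(cache: ToyCache, victim_addresses: list[int]) -> list[int]:
    """Return the cache sets the victim touched, as seen by the attacker."""
    num_sets = len(cache.sets)
    for s in range(num_sets):          # PRIME: fill every set with attacker data
        cache.access("attacker", s)
    for addr in victim_addresses:      # the victim runs and evicts some lines
        cache.access("victim", addr)
    # PROBE: a miss on the attacker's own data marks a set the victim used
    return [s for s in range(num_sets) if not cache.access("attacker", s)]
```

With eight sets and a victim touching addresses 3 and 13, `prime_and_probe(ToyCache(8), [3, 13])` reports sets 3 and 5, since 13 maps to set 5 in this model.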

Problem Number One: Insignificant Threshold For Would-Be Attackers

All that was written above possesses no significance without information on how viable such an attack would be.

Fortunately, the study outlines this for us as well.

It states:

The 'PRIME+PROBE' attack can be used at any time the attacker shares some level of processor cache with the victim.

This condition is satisfied whenever the attacker is able to run another process on the same machine on which the user is running scrypt.

This can even occur if the attacker and victim are "working on separate virtual machines that are hosted on the same machine".

It is not necessary that the victim & attacker be on the same core of a multi-core processor either (as I know that most VM solutions involve dedicating some number of cores to the VM while running).

The above points are not exhaustive; the attacker just needs to have figured out some means of putting themselves in a position to share the machine's cache memory.

Stanford Paper Outlines this Vulnerability in Scrypt That Allows For its Exploitation

Essentially, what is stated in the passages above is that the PBKDF2 execution within Scrypt is what is effectively compromised in this timing attack.

Both the Stanford study and the informational RFC (7914) are very clear in their documentation of the KDF's operation that a compromise of this portion of the Scrypt scheme will allow an attacker to effectively compromise the entire algorithm.

From the Stanford study:

"Learning enough information about the PBKDF2 hash of the victim's password allows an adversary to reduce an attack on scrypt to an attack on PBKDF2, thereby bypassing the memory-hardness of scrypt."

This then leads us to:

"More specifically, once the memory access pattern of scrypt is observed, the attacker can construct a dictionary of PBKDF2 hashes of potential passwords, compute what their access patterns will be, and compare the observed memory access patterns to the access patterns of the hashes in the dictionary."
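The dictionary comparison described in that quote can be sketched as follows. This is a toy illustration, not the paper's code: SHA-256 stands in for scrypt's mixing function, only the first table index is compared, and `first_index` and `filter_candidates` are hypothetical names. The point it shows is that the attacker walks the hash chain without storing it, so checking a candidate needs no large array.

```python
import hashlib


def first_index(password: bytes, salt: bytes, n: int) -> int:
    """Toy model of the first password-dependent table index.

    Walks the same hash chain the victim would use to fill its table, but
    keeps only the running value, so memory use stays constant.
    """
    x = hashlib.pbkdf2_hmac("sha256", password, salt, 1, 32)  # 1 iteration
    for _ in range(n):
        x = hashlib.sha256(x).digest()  # stand-in for scrypt's BlockMix
    return int.from_bytes(x[:8], "little") % n


def filter_candidates(observed: int, candidates: list[bytes],
                      salt: bytes, n: int) -> list[bytes]:
    """Keep only the candidates whose predicted first index matches the observation."""
    return [pw for pw in candidates if first_index(pw, salt, n) == observed]
```

In this model, an attacker who observed the index produced by the real password would call `filter_candidates(observed, wordlist, salt, n)` and discard every candidate that predicts a different index, leaving a much shorter list to test in full.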

At this point, we've now established that Scrypt can be exploited in a way that makes its inclusion in gocryptfs (beyond this point) questionable from many different perspectives.

Going Back to the Litecoin Miners That We Mentioned Earlier

If we're referring to an outputted hash from a successful Scrypt operation (that was not executed in a compromised environment), then sure, it is entirely implausible to suggest that even the entirety of the Litecoin network (hash rate-wise) would have the ability to brute force the password.

However, in light of what I've covered from the Stanford study regarding the trivial nature of the compromise of Scrypt's memory-hardness (which is supposed to be its key feature), we must now seriously consider the threat of this burgeoning network.

Why?

Per the specification, the output of the PBKDF2 function is piped into HMAC-SHA256 (via the two 32-bit output strings that would be created by splitting the 64-bit output of the PBKDF2 function).

This is critical to note because, when considering the information included above about timing attacks, it appears fair to assume that there is no 'random oracle' quality to the data extracted / observed by the attacker during the Scrypt hashing process.

It Won't Be Litecoin - it Will Be BITCOIN Miners

Perhaps in hindsight my point about the Litecoin mining network was a bit irrational since they're mining Scrypt.

But once the difficulty of 'cracking' the Scrypt output is reduced to essentially brute forcing a PBKDF2(-SHA256) password (no salt) with the default specified iterations, this seems like a job that would require trivial resources.

Available Commercial Hardware

Below is a photo depicting Bitcoin mining hardware (top of the class, admittedly):

source: https://www.microbtwhatsminerd1.com/

The picture above shows us that this miner is capable of providing >400 TH/s.

For reference, this represents >4 x 10^14 hash operations PER second.
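The unit conversion for the quoted 400 TH/s figure, and what such a rate would mean for a dictionary search, can be worked through as below. This is back-of-the-envelope arithmetic only: the dictionary size and the number of hash operations per guess are hypothetical placeholders, and it assumes each of the miner's hash operations could be spent on a password guess, which is an assumption rather than an established property of mining hardware.

```python
TERA = 10**12


def hashes_per_second(terahashes: float) -> float:
    """Convert a rate in TH/s to hash operations per second."""
    return terahashes * TERA


def seconds_to_exhaust(num_candidates: int, hashes_per_guess: int,
                       rate_hps: float) -> float:
    """Time to try every candidate at a given hash rate."""
    return num_candidates * hashes_per_guess / rate_hps


rate = hashes_per_second(400)  # 400 TH/s is 4 x 10^14 hashes per second
# Hypothetical workload: 10^12 candidate passwords, 4 hash operations each.
print(seconds_to_exhaust(10**12, 4, rate))
```

Under those placeholder numbers the whole dictionary is covered in about a hundredth of a second; the result scales linearly with both the dictionary size and the per-guess cost.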

This Machine Only Costs $2,500

That's a 'Joe Blow can buy this' price.

I'm sure either you or I could go ahead and purchase this machine right now if we really wanted / needed.

So could many other folks in this world fortunate enough to be a normally functioning adult, responsible enough to maintain a stable job in a 1st-world economy for longer than... 6 months, perhaps.

Scrypt Vulnerability is Equivalent to a Vulnerability Overall for Gocryptfs

Going back to Gocryptfs, it's worth considering the role that 'Scrypt' plays in the larger picture of the cryptographic scheme that Gocryptfs provides for users.

Scrypt is used as the KDF for the password that must be input into the program in order to mount a gocryptfs container.
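As a rough picture of what that key derivation step looks like, here is a sketch using Python's standard `hashlib.scrypt`. It is not gocryptfs's own code (gocryptfs is written in Go). The parameter values (logN = 16, r = 8, p = 1, 32-byte key) are what I understand the gocryptfs defaults to be and should be treated as assumptions, and `derive_key` is a hypothetical helper name.

```python
import hashlib
import os


def derive_key(password: bytes, salt: bytes, log_n: int = 16,
               r: int = 8, p: int = 1) -> bytes:
    """Derive a 32-byte key from a password with scrypt.

    Defaults mirror the assumed gocryptfs settings; they are not read from
    any gocryptfs configuration file.
    """
    n = 1 << log_n
    # scrypt needs about 128 * r * n bytes; double it to leave headroom.
    maxmem = 2 * 128 * r * (n + p + 2)
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p,
                          maxmem=maxmem, dklen=32)


if __name__ == "__main__":
    salt = os.urandom(32)  # a fresh random salt per container
    key = derive_key(b"my container password", salt)
    print(len(key))  # 32
```

The derived key is what ultimately protects the container, which is why a weakness in this one step matters for everything built on top of it.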

Thus, it is no stretch in logic to say that this timing attack on the cache renders the entirety of the gocryptfs scheme extremely vulnerable.

This is amplified by the fact that gocryptfs is invoked from the command line (typically a bash shell), meaning that any attacker that has access to the bash_history on a user's machine will be able to find out the:

  1. Location of the 'gocryptfs' container (which defeats the general practice of initializing a '.'-prefixed [hidden] file)
  2. Parameters for the scrypt KDF
  3. Whether or not there is a password file that's used
And countless other command line specifications that the user may elect to use for their implementation
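To show how little effort it takes to recover those details, here is a small sketch that scans shell history text for gocryptfs invocations. `-scryptn` and `-passfile` are real gocryptfs options; the function name `gocryptfs_invocations` and the sample paths are made up for illustration. It is deliberately simplified: the value of any other option that takes an argument would be misread as a path.

```python
import shlex

VALUE_FLAGS = ("scryptn", "passfile")  # options whose values we want to capture


def gocryptfs_invocations(history_text: str) -> list[dict]:
    """Extract paths, scrypt cost and password file from gocryptfs history lines."""
    findings = []
    for line in history_text.splitlines():
        try:
            tokens = shlex.split(line)
        except ValueError:  # e.g. unbalanced quotes; skip the line
            continue
        if not tokens or tokens[0] != "gocryptfs":
            continue
        info = {"scryptn": None, "passfile": None, "paths": []}
        args = iter(tokens[1:])
        for tok in args:
            if not tok.startswith("-"):
                info["paths"].append(tok)  # positional argument: a directory
                continue
            name, has_value, value = tok.lstrip("-").partition("=")
            if name in VALUE_FLAGS:
                # Accept both "-flag=value" and "-flag value" forms.
                info[name] = value if has_value else next(args, None)
        findings.append(info)
    return findings
```

Feeding it the contents of a history file containing, say, `gocryptfs -init -scryptn 20 /home/u/.cipher` returns one entry with the hidden container path and the scrypt cost parameter already picked out.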

Concluding On Why Scrypt KDF Should Be Replaced

The solution is to replace it with Argon2 (which was literally designed to address all of the issues with Scrypt like the ones that I named above).

This write-up does not outline the characteristics of Argon2 or what about its construction, specifically, makes it a better alternative to Scrypt.

That will be appended in a subsequent entry to this 'issue' in a few hours, when I'm able to find a quick 30 minutes to break that down (for the edification of any that may stumble upon this issue at some point in the future out of curiosity for how the [hopeful] evolution past Scrypt for Gocryptfs occurred).

cryptomedication

Happy to serve and help wherever I'm needed in the blockchain space. #Education #EthicalContent #BringingLibretotheForefront
