There's been...a lot going on with Ethereum lately.
Between the push toward "Ethereum 2.0", the most recent infestation of 'DeFi projects', the faulty Infura.io API, and the potential closures of several exchanges in the space (which serve as potential flash points), there's no telling where Ethereum will end up in the near future.
However, in the long term, the project's outlook appears extremely bleak.
Long-Term Perspective One: Walled-Off Ecosystem
This title does not mean that this report projects that Ethereum will possess a walled-off ecosystem in the future, but rather that there is one right now.
Some of this is due to the way that various projects are oriented. Much of it, however, is due to the enormous costs associated with running a full node.
Importance of Running a Full Node
Running a full node is the only way to operate with blockchain in a truly trustless manner. Therefore, one of the primary focuses of communities like Ethereum should be ensuring that accessing and leveraging the blockchain as a full node becomes gradually more feasible (this can also be augmented with expansions to the ecosystem that utilize the protocol layer itself [where the nodes are] to bolster another layer of interoperability).
However, at the time of writing, this is far from the case. In fact, if anything, it has become increasingly difficult for users to run a full, archival node on the Ethereum blockchain.
Archival Node's Resource-Hungry Demands
At the time of writing, one must be able to set aside 5.5 TB of storage in order to run a full archival node for the Ethereum blockchain.
What's the Importance of Archival Nodes?
Archival nodes (full ones) are the only ones that are capable of providing a history of all the states of the blockchain (i.e., the full list of TXs that have occurred on the blockchain).
They are the comparative equivalents of a full node on other protocols (Bitcoin, Litecoin, etc.)
Knowing this, it's worth asking, 'Why are the requirements for running a full node on Ethereum so high?'
If you said "smart contracts", that's a hell of a guess.
But not quite right.
The true reason is the stateful nature of Ethereum vs. the stateless nature of Bitcoin (although both blockchains are, in many ways, stateless as it pertains to wallet generation).
Archival Nodes: Last of a Dying Breed
Not so long ago (in 2019), Ethereum underwent its 'Constantinople' upgrade.
Despite the community knowing about and agreeing with the decision to upgrade the network weeks in advance, major entities in the Ethereum ecosystem still suffered serious connectivity issues (akin to what the Ethereum blockchain faced on November 11th, 2020, when Infura.io's API endpoint went down for several hours).
Chronicling BlockCypher's Troubles With the 'Constantinople' Upgrade
On March 11th, 2019, BlockCypher published a post-mortem report explaining the cause of their blockchain connectivity issues during the upgrade.
Initial Source of Their Issues
The article states:
"The night of January 8, we realized something was wrong with our Ethereum state but we did not know what: the only thing we knew is we were getting an error that some small piece of data was missing. The Ethereum state is inscrutable — all data is hashed in a tree — and it made it impossible for us to figure out what exactly was wrong"
Their problem is perceivably complex. As mentioned in the excerpt above, any amendment, alteration or omission of pre-image data will result in an entirely different hash (this is due to the 'collision-resistant' property of the hash function being leveraged by Ethereum).
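To make the point about hashed tree data concrete, here is a minimal sketch of a binary hash tree. It is a simplification under stated assumptions: Ethereum uses Keccak-256 and a Merkle Patricia Trie, whereas this example uses SHA3-256 from Python's standard library and a plain binary tree. The account strings and the `merkle_root` helper are hypothetical and exist only for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash: SHA3-256 (Ethereum itself uses Keccak-256, which pads differently).
    return hashlib.sha3_256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a simple binary hash tree (not Ethereum's Merkle Patricia Trie)."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical toy "state": four account entries.
accounts = [b"alice:100", b"bob:50", b"carol:7", b"dave:0"]
root_ok = merkle_root(accounts)

# One small entry goes missing: the root changes completely.
root_missing = merkle_root(accounts[:2] + accounts[3:])

# One character changes: the root changes completely as well.
root_tweaked = merkle_root([b"alice:101"] + accounts[1:])

differing_bits = bin(
    int.from_bytes(root_ok, "big") ^ int.from_bytes(root_tweaked, "big")
).count("1")
print(root_ok.hex())
print(root_missing.hex())
print(f"{differing_bits} of 256 bits differ after a one-character change")
```

The root alone reveals that something is wrong, but not which leaf is missing or altered, which is roughly the position BlockCypher describes being in.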
Why a Full Archival Node Was Necessary
Many readers who are familiar with the Ethereum ecosystem will insist that a full archival node is entirely unnecessary, and that users can instead opt for a 'fast sync' method to reproduce an archival node on the Ethereum network.
An analysis of this method, which relies on a proprietary algorithm to sync a "pruned" version of the full archival node, was recently published on GitHub for those that are curious.
As readers will see, this is far from a trustless means of syncing the blockchain.
Specifically, the report notes:
"Compared to the classical Sybil attack, fast sync provides such an attacker with an extra ability, that of feeding a node a view of the network that's not only different from the real network, but also that might [go] around the EVM mechanics."
"The Ethereum protocol only validates state root hashes by processing all the transactions against the previous state root. But by skipping the transaction processing, we cannot prove that the state root contained within the fast sync pivot point is valid or not, so as long as an attacker can maintain a fake blockchain that's on par with the real network, it could create an invalid view of the network's state"
Following from the above excerpts, this means that an attacker could effectively fork the Ethereum network if they are able to leverage this Sybil attack against enough nodes on the Ethereum network, since the end result would be said nodes having an incompatible view of Ethereum's actual state.
Traveling only slightly further 'up the road' here, this means that an attacker's threshold for subverting 'control' over the Ethereum network is reduced to the feasibility of leveraging this Sybil attack on over 50% of the nodes (excluding full archival nodes).
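The difference between replaying every transaction and trusting the state root at a sync pivot point, as described in the excerpts above, can be sketched as follows. This is a toy model and not Geth's implementation: the state is a plain dictionary of balances, the "state root" is a SHA-256 hash of its JSON serialization, and all function names are hypothetical. Real fast sync also verifies block headers and proof-of-work, which this sketch ignores.

```python
import hashlib
import json


def state_root(state: dict) -> str:
    # Toy commitment: hash of the canonically serialized balances.
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def apply_tx(state: dict, tx: dict) -> dict:
    sender, to, amount = tx["from"], tx["to"], tx["amount"]
    if state.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) - amount
    new_state[to] = new_state.get(to, 0) + amount
    return new_state


def full_sync_verify(genesis: dict, blocks: list) -> bool:
    """Replay every transaction and compare against each claimed root."""
    state = genesis
    for block in blocks:
        try:
            for tx in block["txs"]:
                state = apply_tx(state, tx)
        except ValueError:
            return False
        if state_root(state) != block["state_root"]:
            return False
    return True


def fast_sync_accept(pivot_block: dict, downloaded_state: dict) -> bool:
    """Only checks that the downloaded state matches the pivot's claimed root."""
    return state_root(downloaded_state) == pivot_block["state_root"]


genesis = {"alice": 10, "bob": 0}
txs = [{"from": "alice", "to": "bob", "amount": 3}]
honest_state = apply_tx(genesis, txs[0])
honest_block = {"txs": txs, "state_root": state_root(honest_state)}

# A forged pivot commits to a state that no valid transaction history produces.
forged_state = {"alice": 10, "bob": 1_000_000}
forged_block = {"txs": txs, "state_root": state_root(forged_state)}

print(full_sync_verify(genesis, [honest_block]))
print(full_sync_verify(genesis, [forged_block]))
print(fast_sync_accept(forged_block, forged_state))
```

In this model the replaying node rejects the forged block, while the node that skips transaction processing accepts the forged state because it is internally consistent with the root it was handed.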
Assessing the Potential Risk
To assess the risk and viability of such an attack, we must first get a sense of the network architecture itself, which means that we need to ascertain:
- How many nodes are running on the network
- The distribution of nodes per client (for reasons that will be explained further below)
Fortunately, extracting said information is relatively simple, thanks to data aggregated by 'Ethernodes'.
As seen above, over 80% of the clients are running 'GETH', which means that an error / malfunction on the GETH client could have a destructive impact on the network (i.e., cause connectivity issues, failures to sync, chain splits, etc.)
Ironically, Ethereum's main dev team provided empirical proof that this would be the case just one day prior to the time of publication of this report (November 11th, 2020).
Evaluating the GETH Chain Split "Hard Fork" Caused On November 11th, 2020
Curiously, the post-mortem begins with the following statement:
"Yesterday - 11th November, 2020 - a consensus issue was (deliberately) triggered on the Ethereum network. Opposed to the usual way these play out however, this consensus issue was not between different clients, rather between different versions of the same client, namely Geth"
The cause of the break in consensus among GETH clients stemmed from the following:
"Geth v1.9.7 (released 7th November, 2019) broke the EIP-211 implementation, whereby a memory area was shallow-copied, allowing it to be overwritten out of bounds. "
"The bug was reported by John Youngseok Yang on the 15th July, 2020 and was silently fixed and shipped 5 days later in Geth v1.9.17 (20th July, 2020). This fix brought Geth back into consensus with Besu, Nethermind and OpenEthereum (and the Ethereum specification itself); however it broke consensus with earlier Geth releases."
Again, given the sheer number of GETH clients on the network, the end result was that the entire network suffered from a potential split even though the issue was allegedly restricted to nodes running the same sort of software.
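The class of bug named in the post-mortem excerpts (a memory area that was shallow-copied instead of copied by value) can be illustrated with a toy model. This is not Geth's code and does not reproduce the actual EIP-211 defect. `ToyVM` and its methods are hypothetical names, and the sketch only shows how two versions of the same client can disagree when one keeps a live reference to a buffer and the other takes a snapshot.

```python
class ToyVM:
    """Hypothetical mini-VM; names are illustrative, not Geth's."""

    def __init__(self, copy_return_data: bool):
        self.memory = bytearray(64)
        self.copy_return_data = copy_return_data
        self.return_data = b""

    def mstore(self, offset: int, data: bytes) -> None:
        # Write byte by byte so the buffer is never resized.
        for i, byte in enumerate(data):
            self.memory[offset + i] = byte

    def call_returns(self, offset: int, size: int) -> None:
        view = memoryview(self.memory)[offset:offset + size]
        # Fixed behaviour snapshots the bytes; buggy behaviour keeps a live view.
        self.return_data = bytes(view) if self.copy_return_data else view

    def returndatacopy(self) -> bytes:
        return bytes(self.return_data)


def run(copy_return_data: bool) -> bytes:
    vm = ToyVM(copy_return_data)
    vm.mstore(0, b"\x01\x02\x03\x04")
    vm.call_returns(0, 4)               # a call "returns" memory[0:4]
    vm.mstore(0, b"\xff\xff\xff\xff")   # the same region is overwritten later
    return vm.returndatacopy()


fixed_result = run(copy_return_data=True)
buggy_result = run(copy_return_data=False)
print(fixed_result.hex(), buggy_result.hex())
```

Both versions execute the same sequence of operations yet read back different return data. In a blockchain client, any such divergence produces different state roots, which is what splits nodes running one version from nodes running the other.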
The post-mortem report describes the split itself as follows:
"Unfortunately not all node operators were running recent releases and yesterday morning a transaction managed to trigger the consensus issue, splitting old Geth releases off from the rest of the network. This became a larger issue as Infura was one of the affected parties, hence taking with them their client base."
Infura.io, specifically, is such a large entity that it's possible that most of the Ethereum ecosystem was taken offline as a result of this spontaneous hard fork.
The GETH team's explanation for pushing the Ethereum hard fork is flimsy at best, but that's a bit outside the scope of what's being covered in this report (for more information on that, simply scroll to the mock 'Q&A' section of the post-mortem report).
Following the Carnage via CoinDesk Reporting
Fortunately, much of the actual disruption in the greater blockchain space was chronicled by CoinDesk on the same day as the incident.
The report accurately notes that the issue was initially reported as one with Infura.io specifically.
This fact alone provides an idea of just how many different pieces in the Ethereum ecosystem are reliant upon Infura.io's persistent connectivity via their APIs.
As noted in the report, critical Ethereum applications were taken offline, such as:
As well as various other exchanges.
In essence, nearly every major functioning part of the Ethereum ecosystem found itself without connectivity, entirely.
For some reason, exchanges also reported troubles with the Ethereum network (although only Binance's CEO, Changpeng Zhao, was vocal about measures taken to mitigate / exacerbate any potential issues arising from the chaos on the Ethereum network).
Devolving Back to Legacy Financial Problems
This, of course, left the end user in a vulnerable spot that they imagined they'd never be in while using blockchain. And that's a spot where their funds are in a wallet somewhere, yet still inaccessible due to infrastructural issues on the network.
This scenario shouldn't be entirely unfamiliar to any readers or victims of this unfortunate inconvenience, because it is an issue that customers of commercial banks have had to deal with at one point or another.
What makes this situation so unfortunately ironic is that blockchain has long since been touted as a means of escaping this very problem. That's the major reason why the trustless property is the primary feature of the blockchain overall.
Problematic Behavior By Nodes Like Infura
One of the alleged reasons for the extremely disruptive nature of the hard fork is that several of the major providers (e.g., Infura, Blockchair, etc.) were running older, non-upgraded versions of the client.
This is particularly troublesome though because there was a bug in the older client that allowed for the circumvention of EIP-211.
According to the CoinDesk article (consistent with what was reported in the post-mortem):
"Infura has fixed the issue, as have other service providers who were affected by the snafu, by updating their nodes. These stakeholders were running an older version of Geth, which contained a bug that Ethereum developers silently patched in recent update – an update which Infura and Blockchair, among others, ignored."
Which essentially means that the entire Ethereum ecosystem hinged (and more than likely still hinges) on the reliability of Infura (almost solely it seems).
And that being the case makes concepts and ideas like "decentralization", "p2p", "trustless", etc., non-starters when describing the Ethereum ecosystem.
Each of those properties is individually and collectively violated by the ecosystem's current orientation. The fact that Infura was relied upon to uphold the network and update its node software in a timely fashion is a prime example of a trust-based relationship.
And, to reiterate, blockchain (as envisioned & designed by Satoshi Nakamoto) is a wholly trustless process (this is the primary innovation of blockchain, not 'decentralization').
Revisiting the BlockCypher Debacle
For those that remember, we covered the BlockCypher debacle, which occurred during the Constantinople upgrade that took place between January 2019 and March 2019.
We made it to the part of their post-mortem that addresses their attempt to find the 'missing part' of their sync data via the 'fast sync' feature for GETH nodes.
However, as the report notes, this attempt fell short as well, stating:
"Having failed multiple times to discover and recover the missing data, we began the ‘Fast’ sync process: it took over 2 days for a “fast” sync to complete. Unfortunately, It did not help us restore the missing data, nor did it restore our state."
BlockCypher also provided a few reasons why the fast sync was not effective in this instance, stating the following:
"Why did a fast sync not work? Because it only includes a small subset of the whole blockchain data. To provide and operate our APIs reliably we need all of it."
"Why didn’t we make a back-up copy of our state before doing the Constantinople update? We did, but it was partially corrupted by the restore. Also the Ethereum state is not a database that can simply be backed-up and patched. It can’t be done while the Ethereum node is online, it can’t be done incrementally (and is well over a terabyte)."
That last excerpt is of particular importance because it underlines another major setback with the statefulness of the Ethereum protocol (something that Tendermint suffers from as well): a massive amount of bloat is added to the chain, as each state is essentially nested within the other for various moving pieces in the protocol. The result is the 5.5+ TB chain that exists for GETH at the time of writing.
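The storage cost of retaining every historical state, as opposed to only the latest one, can be sketched with a toy node. This is a deliberate simplification: real clients store trie nodes rather than whole copies of the state, and `ToyNode`, its methods, and the three-block `history` below are hypothetical and purely illustrative.

```python
class ToyNode:
    """Hypothetical node that stores a balance snapshot per block."""

    def __init__(self, archive: bool):
        self.archive = archive
        self.states = {}  # block number -> copy of all balances at that block
        self.head = -1

    def import_block(self, balances: dict) -> None:
        self.head += 1
        self.states[self.head] = dict(balances)
        if not self.archive:
            # Pruned node: discard every state except the most recent one.
            for number in [n for n in self.states if n != self.head]:
                del self.states[number]

    def balance_at(self, account: str, block_number: int) -> int:
        if block_number not in self.states:
            raise LookupError(f"state for block {block_number} was pruned")
        return self.states[block_number].get(account, 0)


# Illustrative three-block history (made-up balances).
history = [
    {"alice": 10, "bob": 0},
    {"alice": 7, "bob": 3},
    {"alice": 7, "bob": 1, "carol": 2},
]

archive_node = ToyNode(archive=True)
pruned_node = ToyNode(archive=False)
for balances in history:
    archive_node.import_block(balances)
    pruned_node.import_block(balances)

print(archive_node.balance_at("alice", 0))
print(len(archive_node.states), len(pruned_node.states))
```

Only the archival variant can answer questions about an earlier block, and the price is storage that grows with every block imported.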
In particular, it's also worth heeding the warning / lesson provided by BlockCypher, which is that:
"Ethereum state is very different that other blockchains. It cannot be restored using any traditional backup method."
Full Archival Node Sync Setup Became Inevitable
Since BlockCypher was unable to recover the chain's state via backup or even through a fast sync setup, they found themselves in a position where they needed to sync the full archival node, which the report details was 2+ TB in total at that point in time.
Below is another excerpt from the post-mortem report that describes the process:
"As a last resort, we began a ‘Full’ sync of the 2+Terabyte Ethereum state on January 12. Knowing the size we had to contend with, we upgraded to the biggest available machines in attempts to get the sync working faster. It barely made a difference. Compounding our problems — because there’s no transparency in the process — was the fact that we had no idea of our status in the upgrade and had no info in order to update our customers."
Sadly, it appears that their efforts were all for naught, with the reason given in the following:
"On January 14 — the day before the Constantinople HF was scheduled to take effect — it was cancelled. Apparently a security audit found a vulnerability that could allow a potential attacker to steal cryptocurrency from a smart contract. The last minute cancellation was incredibly demoralizing to us. Had we waited to implement Constantinople until AFTER it took affect, we would have saved ourselves an incredible amount of work, angst, expense….and our ETH API would have been working the entire time."
To this end, it seems clear that poor communication on the part of Ethereum's dev team(s) has led to at least two major (potentially chain-splitting) issues on the protocol, which should not be considered acceptable by any stretch for Ethereum stakeholders.
Concerning Lack of Full Archival Nodes
As BlockCypher made clear in their writeup, there appears to be no other means of accomplishing what they were trying to do other than actually syncing a FULL archival node (w/ the Trie state included).
Unfortunately, this mere realization was not the end of BlockCypher's woes. Having trouble with their first attempt to bring up a full archival node, the team reached out to the famous Vitalik Buterin (figurehead and widely known "creator" of Ethereum) for guidance on what should be done in this instance.
They allegedly received the following response:
The most concerning part of Vitalik Buterin's reply was its flippant acknowledgment that there are hardly any other stakeholders in the Ethereum ecosystem that can be relied upon to facilitate a FULL archival node sync for the Ethereum blockchain.
All of the above culminated in this statement by BlockCypher:
"In the event of a chain re-organization, we may be the only ones to know the entire history of Ethereum transactions"
And that's a major problem.