\section{Introduction}\label{sec:introduction}

Blockchain~\cite{nakamoto2008bitcoin} technology has come a long way in recent years, but one major issue still persists: the lack of scalability. Blockchain systems struggle with the large amount of data that must be stored and transmitted between nodes when they synchronize with each other or bootstrap from the network. This lack of scalability limits the adoption of blockchain technology and hinders its potential to revolutionize a variety of industries.

A blockchain is an ordered collection of blocks that are {\em valid} with respect to the application specifications. Note that some blockchain designs require a total order (\emph{e.g.}, Bitcoin~\cite{nakamoto2008bitcoin} or Ethereum) while others do not (\emph{e.g.}, Avalanche~\cite{rocket2020scalable}, Byteball~\cite{byteball}, Sycomore~\cite{AGLS18}). Each ordered collection shares the same starting point, called the {\em genesis block}. Each blockchain has a deterministic criterion to determine the {\em best} ordered collection of blocks. In the following, we consider a totally ordered sequence of blocks, and the best collection refers to the longest chain starting from the genesis block.

Blockchain data comes in two types: \textit{application data} and \textit{consensus data}. Application data includes transactions, account balances, smart contract state evolution, and everything else included in the block data itself. Consensus data includes consensus-critical information, such as the proof-of-work or proof-of-stake and the nonces required to discover the longest chain. Everything that is part of the block header is considered consensus data. While application data can grow or shrink depending on the implementation, consensus data grows boundlessly at a constant linear rate in time~\cite{kiayias2021mining}.
Recently, Kiayias~\textit{et al.}~\cite{kiayias2021mining} proposed a blockchain protocol that reduces the storage and communication complexity of PoW blockchains to $O(\polylog(n))$. However, the security of their protocol was only proven under the assumption of a fixed PoW difficulty for all blocks. This assumption is not realistic in practice: for example, the block difficulty in Bitcoin has shown exponential growth over the past decade. In this work, we address this important issue and present XX (name to be chosen??), a scheme to construct a succinct representation of the blockchain using Non-Interactive Proofs-of-Proof-of-Work (NIPoPoWs) that operates in $O(\polylog(n))$ storage complexity and $O(\polylog(n))$ communication complexity while handling a variable difficulty for the blocks of the blockchain. The main idea of our construction is to XXXXXXXX
% In this paper, we focus on the aforementioned protocol.
% We modify it to fit a variable difficulty setting~\cite{garay2017bitcoin}, with participants joining or leaving the network. We prove that the properties needed to maintain security of the protocol still hold in a dynamic context.
Our contributions are as follows:
\begin{itemize}
\item We propose XX, a NIPoPoW protocol that handles a variable PoW difficulty for blocks while operating in $O(\polylog(n))$ storage complexity and $O(\polylog(n))$ communication complexity;
\item We present experimental results illustrating the compression of Bitcoin.
% \item Study of the Mining in Logarithmic Space protocol in the variable difficulty setting.
%\item Modification of the protocol to account for said dynamicity.
% \item Proof of properties in the dynamic context.
\end{itemize}
The remainder of this paper is organized as follows. XXXX

\section{Consensus and application data}\label{sec:agreement}
% Essentially the same information as in Mining in Log Space.
% Might add some context on the dynamic environment?
\subsection{Application state}
Blockchain systems are designed to maintain an accurate application state. This state can be leveraged to determine important information, such as who owns what and how much of it.
%One of the key decisions in designing a blockchain system is determining how ownership should be represented.
%There are two primary ways of doing so: UTXO-based and accounts-based systems.
%In UTXO-based systems, the application state is comprised of the unspent transaction outputs that are available for spending.
%On the other hand, in accounts-based systems, the application state is comprised of accounts and balances.
%Notably, Bitcoin uses the UTXO-based approach to maintain its application state, while Ethereum uses an accounts-based system.
%
%The application state of a blockchain system is in a constant state of evolution as new transactions are applied to it.
%Each transaction acts as a state evolution operator, making changes to the current state of the system.
%In essence, a transaction represents a set of instructions that dictate how the application state should be transformed.
%By applying a transaction to a previous application state, a new application state can be computed.
%This new state reflects the changes introduced by the transaction, such as the transfer of ownership of assets or the execution of a smart contract.
%
Each block in a blockchain system represents a batch of transactions that are processed in a particular sequence. By processing these transactions in a specific order, the block acts as a state evolution operator that updates the overall application state of the system. There are two primary schools of thought regarding what should be stored in each block in a blockchain system. The first school advocates for storing only transactions, which represent the changes made to the application state.
The application state at the end of the blockchain can then be computed by starting from the genesis application state and traversing the blockchain, applying the state evolution described by each block to arrive at the final application state. The second school argues for storing, in each block, both the transactions and the state after these transactions have been applied (called a snapshot). In such a system, the application state at the end of the blockchain does not need to be computed by applying the blocks. Instead, a block near the end of the chain can simply be inspected, and the application state within it extracted. This can result in faster queries and simpler implementation of certain types of smart contracts. However, this approach requires more storage space and can make synchronization of the network more complex.
%It is possible to apply either the transaction-only or snapshot approach to either UTXO-based or accounts-based ownership representations.
%Bitcoin uses the UTXO-based representation and stores only transaction deltas, while Ethereum uses an accounts-based representation and stores snapshots.
%Bitcoin could potentially commit the newly computed unspent transaction output (UTXO) in every block, and some Bitcoin forks have already implemented this feature.
%On the other hand, Ethereum maintains both deltas and snapshots in its blocks.
%While snapshots are not necessary, they are incredibly helpful in ensuring the integrity and efficiency of the blockchain.
For the rest of this paper, we work under the same assumption as Kiayias~\textit{et al.}~\cite{kiayias2021mining}: we assume a Proof-of-Work blockchain in which each block commits to an application state snapshot.
%The representation of ownership is irrelevant for our purposes, and can either be UTXO-based or accounts-based.

\subsection{Consensus data}
Application data has the potential to increase or decrease in size.
Accounts and smart contracts can be created, removed, or modified. Conversely, consensus data expands steadily over time with the addition of blocks to the blockchain.

\subsection{Bootstrapping}
A \emph{verifier} is a bootstrapping node that wants to synchronize with the rest of the network, booting for the first time and holding only the genesis block. This verifier receives NIPoPoWs from nodes already part of the network, which we call \emph{provers}. We assume that at least one of the provers is honest; this is a standard assumption~\cite{garay2017bitcoin,garay2015bitcoin,heilman2015eclipse,wust2016ethereum,kiayias2021mining}. Without loss of generality, this scenario can be reduced to just one honest prover and one adversarial prover. The verifier then needs to determine which of the NIPoPoWs it receives is the right one.

\section{Model}
We consider a large number of parties that locally maintain and update their blockchain. The system is open, meaning that parties may join and leave the system whenever they want, without any admission control.
%In order to consistently update their local blockchains, participants have to agree on the next block to append to their blockchain.
%Various strategies to achieve this goal exist.
%In the following, we will consider Proof-of-Work~\cite{Back02hashcash} (PoW) strategy.
We consider a \emph{synchronous} setting in which time is quantized into discrete rounds~\cite{cryptoeprint:2014:765,10.1145/3460120.3484784}, during which every party can send a message to each of its neighbours, receive the messages sent to it during the round, and execute computational steps based on the received messages. Note that computational steps other than hashing are treated as instantaneous.
%This reflects a \emph{synchronous} network.
We assume the presence of a Byzantine or malicious adversary that may control strictly less than half of the total computational power currently available in the system.
This model, named the ``Computational Threshold Adversary''~\cite{AM2017}, is an alternative to the Common Threshold Adversary model, which bounds the number of parties the adversary controls relative to the total population of the system. Specifically, each party is allowed to make $q$ queries to a cryptographic hash function in every round. The adversary controls up to $t$ parties and can therefore query the cryptographic hash function up to $t \times q$ times per round~\cite{cryptoeprint:2014:765}. We suppose that the adversary is a \textit{rushing adversary}, in the sense that it can observe what the honest parties have done during the round before using its computational power at the end of the round. The adversary is also a \textit{Sybil adversary}, as it can inject as many additional messages as it wishes by faking multiple identities. We limit the adversary to a probabilistic polynomial-time Turing machine that behaves arbitrarily, i.e., it may not follow the prescribed algorithms. However, the adversary remains computationally bounded: it cannot, in a polynomial number of steps, forge honest parties' signatures or break the hash function and signature scheme, except with negligible probability. We therefore term our adversary the \emph{1/2-bounded PPT adversary}. Any party following the prescribed protocol is called an \emph{honest} party.
% \textcolor{blue}{Do we need to introduce cryptographic hash functions?}
\section{Non-Interactive~Proofs-of-Proof-of-Work}
\label{sec:kiayias}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Intuition}
The proof-of-work system requires each party to generate a ``proof'' of investment of a limited resource, such as hash power, that takes time to generate but can be quickly verified by other parties. Every party that wants to append a block to the blockchain is required to provide a \emph{nonce}, along with the contents of the block, that hashes to a value below a given target. The hash function $\mathcal{H}$ is modelled as a random oracle~\cite{random-oracle}, i.e., it behaves like an ideal random function and produces constant-length output. Since the distribution of hash values is stochastic, some blocks end up with hash values significantly below the target.
\begin{definition}[$\ell$-superblock (\cite{10.1145/3460120.3484784})]
A block that hashes to a value less than $T/2^{\ell}$ is said to be an $\ell$-superblock, where $T$ is the current target value and $\ell \geq 1$.
\end{definition}
Note that every $\ell$-superblock is also an $\ell'$-superblock for any $\ell' \leq \ell$, and that the genesis block is considered to have a hash value of $\texttt{0x00}\ldots\texttt{0}$ and is hence a superblock of the highest level. NIPoPoWs compress a PoW-based blockchain by subsampling its blocks~\cite{10.1007/978-3-662-53357-4_5}. The working principle behind this compression lies in the assumption that a sub-sample of the blocks, i.e., the $\ell$-superblocks, can be sufficient to estimate the size of the original distribution of block headers~\cite{karantias2020compact,10.1145/3460120.3484784,10.1007/978-3-030-51280-4_27}. The key idea is to sub-sample the blocks in the blockchain such that the sub-sampled chain represents the original chain: any difference in the original blockchain results in a different sub-sampled blockchain.
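To make the definition above concrete, here is a minimal sketch of a level computation, assuming integer hash values and targets drawn from the same hash space (the function name is ours, for illustration only):

```python
def superblock_level(block_hash: int, target: int) -> int:
    """Largest level l such that block_hash < target / 2**l.

    Illustrative sketch: every valid block (block_hash < target) has
    level >= 0, and each halving of the threshold is met by
    exponentially rarer blocks.
    """
    assert block_hash < target, "not a valid block for this target"
    level = 0
    while block_hash < target // 2 ** (level + 1):
        level += 1
    return level

# A hash just below the target is an ordinary block; an exceptionally
# small hash clears several halvings of the threshold.
assert superblock_level(1000, 1024) == 0
assert superblock_level(100, 1024) == 3   # 100 < 1024/2**3 but not < 1024/2**4
```

A genesis-like hash of $0$ reaches the highest level representable for the given target, matching the convention above.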
In more detail, in a long enough execution of a PoW blockchain, on average $1/2^{\ell}$ of the blocks are $\ell$-superblocks. A NIPoPoW samples the $\ell$-superblocks to prove that the original blockchain contained $2^\ell$ blocks. In order to convince honest parties, the NIPoPoW contains a constant number $m$ of superblocks at each level (see Figure~\ref{fig:compression}).
%The idea behind the NIPoPoW is that the security properties associated with the entire blockchain are also associated with superblocks at each level. Hence, the longest chain can be proven with only superblocks of the blockchain.
% \ea{You should explain why a constant and known number of superblocks convinces the verifier}
% The scheme requires every block header to store pointers to the last superblock at every level in order to ensure that the subsampled blocks also form a valid chain.
A chain of $n$ blocks will contain superblocks at $O(\log(n))$ levels, as illustrated in Figure~\ref{fig:compression}. Hence, the space and communication complexity of a NIPoPoW is $O(\polylog(n))$. The proposal by Kiayias et al.~\cite{10.1145/3460120.3484784} offers the best-known compression of PoW blockchains so far. It achieves $O(\polylog(n)c + kd + a)$ storage and communication costs while allowing parties to mine new blocks based on this compressed blockchain, where $c$ is the size of a block header, $k$ is the common prefix parameter, $d$ is the size of the application data per block, and $a$ is the size of the application data in the blockchain.
\begin{figure}
\centering
\begin{subfigure}[b]{\linewidth}
\includegraphics[width=1\linewidth]{figs/compression.drawio.pdf}
\caption{Blockchain before compression, separated into a stable and an unstable part.}
\end{subfigure}
\par\bigskip
\begin{subfigure}[b]{\linewidth}
\includegraphics[width=1\linewidth]{figs/compressed.drawio.pdf}
\caption{Blockchain after compression.
The proof $\Pi$ is composed of the stable part $\pi$ and the unstable part $\chi$.}
\end{subfigure}
\caption{\label{fig:compression} Kiayias~\textit{et al.}'s compression scheme~\cite{kiayias2021mining}.}
\end{figure}
%However, their solution reduces the security of the protocol by guaranteeing resilience to only a third Byzantine adversary. Improving these security guarantees in NIPoPoW is the primary focus of the work.

\subsection{Algorithmic ingredients of the NIPoPoW}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Any scheme for operating on and compressing blockchains requires the design of (i) a \emph{chain compression} algorithm and (ii) a \emph{compressed chain comparison} algorithm to determine which compressed chain should be retained in the case of forks.
\subsubsection{Chain Compression Algorithm}
Kiayias et al.'s chain compression algorithm (from~\cite{10.1145/3460120.3484784}, Algorithm~\ref{alg:chaincompression}) is parameterized by a security parameter $m$ and the common prefix parameter $k$. The parameter $m$ represents the number of blocks that a party wishes to receive in order to feel safe. The algorithm compresses the blockchain except for the $k$ most recent blocks, called \emph{unstable} blocks. The compression works as follows: for the highest level $\ell$ that contains more than $2m$ blocks, keep all the blocks; for every level $\mu$ below $\ell$, keep only the last $2m$ blocks and all the blocks after the $m^\text{th}$ block (counted from the end) at level $\mu+1$. We use $\Pi$ to denote an instance of a NIPoPoW proof.
%\sg{what is $\mu$ here?}
%\ea{We should introduce the $\Pi$ notation here}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Compressed Chain Comparison Algorithm}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%TODO: Give Intuition
Let $\Pi_1, \Pi_2, \ldots, \Pi_n$ be the different compressed blockchains that a new party receives. The party first applies the compression algorithm to every compressed blockchain to make the comparison fair. To compare any two compressed blockchains $\Pi$ and $\Pi'$, the comparison algorithm selects the minimum level $\mu$ that contains a block present in both $\Pi$ and $\Pi'$. If no such block is found, this necessarily implies that the greatest level (compression level $\ell$) in the two compressed blockchains is not the same, and the algorithm simply selects the one with the greatest level. If a block $b$ is found in both $\Pi$ and $\Pi'$ at the same level $\mu$, then the blockchain with the greatest number of blocks after $b$ wins the comparison.
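The compression rule just described can be sketched as follows. This is a simplified model under stated assumptions, not the authors' implementation: blocks are opaque identifiers, `level` is an assumed helper returning a block's superblock level, and "the m-th block at level mu+1" is read as the m-th block counted from the end of that level:

```python
def compress(chain, level, m, k):
    """Sketch of the compression rule described above (simplified model).

    `chain` lists block ids oldest-first; `level(b)` is an assumed helper
    giving the superblock level of block b.
    """
    chi = chain[-k:]                        # the k most recent, unstable blocks
    stable = chain[:-k]
    pos = {b: i for i, b in enumerate(chain)}

    def at(mu):                             # stable blocks of level >= mu
        return [b for b in stable if level(b) >= mu]

    ell = 0                                 # highest level with more than 2m blocks
    while len(at(ell + 1)) > 2 * m:
        ell += 1

    keep = set(at(ell))                     # keep that level entirely
    for mu in range(ell - 1, -1, -1):
        pivot = at(mu + 1)[-m]              # m-th block from the end, one level up
        cur = at(mu)
        keep.update(cur[-2 * m:])           # last 2m blocks of level mu
        keep.update(b for b in cur if pos[b] >= pos[pivot])

    pi = sorted(keep, key=pos.get)          # the stable part of the proof
    return pi, chi
```

On a toy 30-block chain whose levels are, say, the number of trailing zero bits of the block index, `compress(..., m=1, k=3)` retains only the highest full level plus a short dense suffix at each lower level, illustrating the exponential thinning of older blocks.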
%\subsection{Properties}
%\section{Variable difficulty setting}

\section{Properties of permissionless blockchains}\label{sec:background}
We consider \emph{Proof-of-Work}-based permissionless blockchains, which achieve Nakamoto consensus~\cite{cryptoeprint:2014:765,10.5555/3002702} without relying on a trusted party, by requiring the parties to contribute a limited resource such as hashing power.
A robust blockchain protocol must ensure the following properties~\cite{cryptoeprint:2014:765}:
%\begin{itemize}
\textbf{Safety} with parameter $k\in \mathbb{N}$: if any two honest parties have a valid transaction that appears in a block at least $k$ blocks away from the end of their blockchain, then this transaction appears at the same position in both blockchains, with overwhelming probability.

\textbf{Liveness} with parameter $u \in \mathbb{N}$: if all honest parties try to insert a valid transaction into their blockchain for $u$ consecutive rounds, then the transaction is accepted by every honest party by the end of the last of those $u$ rounds, with overwhelming probability.
%\end{itemize}

\section{Related Work}\label{sec:related}
The problem of the blockchain growing to a considerable size was predicted by Satoshi Nakamoto in the original paper that introduced Bitcoin~\cite{nakamoto2008bitcoin}. He offered a simple solution, \emph{Simplified Payment Verification (SPV)}, which only requires a client to store the block headers and leave out transactions. Still, the amount of data that needs to be downloaded from the network grows linearly with the size of the blockchain. An alternative would be for SPV clients to embed hardcoded checkpoints, but that would introduce additional trust assumptions. Flyclient~\cite{9152680} allows a succinct and secure construction of proofs in a setting with variable difficulty. It makes use of Merkle Mountain Ranges to reference the whole previous blockchain from every block. If a full node holds a proof and mines a new block on top of it, it cannot create a new proof without holding the whole chain; thus, logarithmic-space mining is not possible with this scheme. CoinPrune~\cite{coinprune} still requires storing the entire chain of block headers prior to the pruning point. Another approach to building succinct proofs is to rely on SNARKs (Succinct Non-interactive ARguments of Knowledge).
Coda~\cite{coda2020} is such a construction: it compresses a chain to polylogarithmic size and updates the proof with new blocks. However, leveraging SNARKs requires a trusted setup for the common reference string. Kiayias et al.~\cite{10.1007/978-3-662-53357-4_5} introduced and formalized an interactive proof mechanism based on superblocks, \emph{Proofs-of-Proof-of-Work} (PoPoW), that allows a client to verify a chain in sublinear time and communication complexity. However, the authors later showed the existence of an attack on the scheme and proposed a non-interactive alternative (NIPoPoWs)~\cite{10.1145/3460120.3484784}, although this solution did not address the size of the blockchain that any miner needs to store. The authors further used NIPoPoWs to develop a scheme that allows the miners to operate in $O(\polylog(n))$ storage and communication complexity, at the cost of reducing the security tolerance to a Byzantine adversary that controls strictly less than a third of the total computational power, and of operating in an environment with a fixed difficulty~\cite{10.1145/3460120.3484784}. The authors of~\cite{jain2022extending} propose a scheme to construct a succinct representation of the blockchain using NIPoPoWs that also operates in $O(\polylog(n))$ storage complexity and $O(\polylog(n))$ communication complexity, and which provably achieves security against a Byzantine adversary that controls strictly less than half of the total computational power. The main idea of their solution is \emph{(i)} to attach increasing weights $W_\beta(\ell)$ to the $\ell$-superblocks whose level $\ell$ is above a given threshold $\beta$ (such superblocks are called \emph{$\ell$-diamond} blocks), and \emph{(ii)} to modify the chain selection rule so that the selected succinct chain is the one that accumulates the largest weight.
With the modified chain selection rule, it is improbable for an attacker controlling less than half of the total hashing power to bias the honest sub-sample of the blockchain by suppressing these $\ell$-diamond blocks. The crucial point of their solution, in terms of security, is that an adversary cannot fake this set of diamond blocks without actually providing work. Because the adversary has minority mining power, it cannot create a heavier sequence of diamond blocks faster than the honest parties, for the same reason that an adversary cannot create a longer regular blockchain faster than the honest parties. Unfortunately, this solution also assumes a fixed difficulty.
%Section~\ref{sec:background} talks about consensus and application data.
\begin{table*}
\caption{Comparison to other works.}
\adjustbox{max width=\textwidth}{%
\centering
\begin{tabular}{lccccccc}
\toprule
& \textbf{BTC Full} & \textbf{BTC SPV} & \textbf{Ethereum} & \textbf{Superblock NIPoPoWs} & \textbf{FlyClient} & \textbf{Mining in Log. Space} & \textbf{This work} \\
\midrule
\textbf{Prover storage} & $n(c+\delta)$ & $nc + \log(\delta)$ & $nc + k\delta + a$ & $nc + k\delta + a$ & $nc + k\delta + a$ & $\polylog(n)c + k\delta + a$ & ? \\
\textbf{Communication} & $n(c+\delta)$ & $nc + \log(\delta)$ & $nc + k\delta + a$ & $\polylog(n)c + k\delta + a$ & $\polylog(n)c + k\delta + a$ & $\polylog(n)c + k \delta + a$ & ? \\
\textbf{Can verifier mine?} & Yes & No & Yes & No & No & Yes & Yes? \\
\textbf{Works in variable difficulty?} & Yes & Yes & Yes & No & Yes & No & Yes? \\
\bottomrule
\end{tabular}}
\label{tab:comp}
\end{table*}

\section{Mining in Logarithmic Space with Variable Difficulty}\label{sec:variable}
\begin{itemize}
\item $k$ is not correlated to $m$ anymore.
\item Link of $m$ related to $|epoch|$?
\end{itemize}
The protocol described by Kiayias~\textit{et al.}~\cite{kiayias2021mining} works only in a constant difficulty setting.
That is, participants cannot join or leave the network: the number of participants is fixed from the start. This entails a number of limitations that render the protocol inoperative in the variable difficulty setting. First, the $\chi$ portion of the proof cannot be a constant number of blocks long. Indeed, a low value for the common prefix parameter $k$ means the blockchain will potentially have a non-stable tip, sacrificing persistence. A high value for $k$ means, on the other hand, that the tip will be too old, sacrificing liveness~\cite{zindros2020decentralized}. The value of $k$ must correspond to \textit{sufficient work having been performed}~\cite{kiayias2021mining}. The verifier must then first measure the difficulty of the network before comparing proofs. Another problem arising in the variable difficulty setting is that superblocks are defined only in terms of relative difficulty. In other words, superblocks only count the number of leading zeroes and do not take into account the absolute difficulty of the block. This is not a problem in the constant difficulty setting, since every miner has the same target and thus the same reference point. However, it becomes a problem in the variable difficulty setting, since there is no longer a common target for all miners. This means the adversary can potentially mine on a chain with lower difficulty, obtaining superblocks that are rarer than those on the honest chain, and ultimately being picked by the verifier over the honest chain. Finally, the previous solution's compression algorithm does not account for a varying compression parameter. Subsequent compressions with different values of $m$ can lose blocks that would otherwise have been included. Since one might need blocks that have been previously discarded, the Online property of Mining in Logarithmic Space is no longer valid in the variable difficulty setting.

\subsection{Evaluating the rarity of a block}
PoW systems rely on two functions, namely \diff{.} and \target{.}.
Function \target{.} computes a value that depends on the current best blockchain, so as to ensure a constant interblock delay. For instance, Bitcoin computes a new \target{.} every sequence of 2016 blocks, based on the empirical interblock delay observed over the previous sequence. The adjustment of the \target{.} output aims at handling the variation of the population of the system. In case of a population growth, blocks are generated with a smaller interblock delay, and \target{.} is thus lowered by the protocol. On the contrary, if the population decreases, the interblock delay increases and \target{.} has to be raised by the protocol. Function \diff{b} computes a value that depends on the given block \(b\). In such a system, a block \(b\) is {\em valid} if \(b\) meets the application specifications, and if \diff{b} satisfies the current interblock delay condition, \emph{i.e.}, if the condition \diff{b} \(\leq\) \target{b} holds. For the sake of simplicity, we consider that \diff{b} = \hash{b}. Note that given the assumption on the \hash{.} function, the distribution of \diff{b} is uniform over the interval \(\llbracket 0 ; 2^{\ell} -1\rrbracket \), \emph{i.e.}, all hash values are equiprobable. In addition, given a \target{.} value \(t\), the probability to create a valid block \(b\) is given by \(P\{ b \textnormal{ is valid} \mid T=t\} = P\{\) \diff{b} \(\leq t \mid T=t\} = t/2^{\ell}\). In other words, the \target{.} value adjustment makes valid block creation harder or easier, but given a \target{.} value, all valid blocks are equiprobable. In order to validate our assumptions on \hash{.}, and thus on \diff{.} and \target{.}, we analyze the ratio between \diff{b} and \target{b} on Bitcoin's blockchain. At the time of writing, this blockchain gathers around \(785000\) blocks. Figure~\ref{fig:ratio} depicts the cumulative distribution of these ratio values in ascending order, which clearly supports the assumption of a uniform distribution of \hash{.} values.
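The validity condition and its success probability can be illustrated with a minimal Python sketch (purely hypothetical names, with a toy 32-bit hash space instead of Bitcoin's 256 bits), which empirically recovers \(P\{b \textnormal{ is valid} \mid T=t\} = t/2^{\ell}\):

```python
import random

ELL = 32                 # toy hash length in bits (Bitcoin uses 256)
SPACE = 2 ** ELL         # hash values are uniform over [0, 2^ELL - 1]

def diff(block_hash: int) -> int:
    # Simplification used in the text: diff(b) = hash(b).
    return block_hash

def is_valid(block_hash: int, target: int) -> bool:
    # A block b is valid when diff(b) <= target(b).
    return diff(block_hash) <= target

# Empirically check that P{b valid | T = t} is close to t / 2^ELL.
random.seed(0)
t = SPACE // 4           # example target: expected success rate ~0.25
trials = 100_000
hits = sum(is_valid(random.randrange(SPACE), t) for _ in range(trials))
print(abs(hits / trials - t / SPACE) < 0.01)
```

This is only a simulation of the uniformity assumption on \hash{.}, not of an actual PoW loop: lowering `t` makes `is_valid` succeed proportionally less often, which is exactly the adjustment mechanism described above.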
\begin{figure} \includegraphics[width=\linewidth]{figs/distribution_ratio.pdf} \caption{\label{fig:ratio} Ratio of \diff{.} to \target{.} over Bitcoin's blockchain.} \end{figure} Finally, we denote by \level{.} the level function defined as follows: %\level{b} \(= \max_{k \leq 0}\) \diff{b} \(\in \llbracket\) \target{b} - \target{b}\(/2^k\) ; \target{b}\(\rrbracket\). \[ \mathfrak{level}(b) = \max \left\{ k \in \llbracket 0 ; \ell - 1 \rrbracket : \mathfrak{diff}(b) \in \llbracket \mathfrak{target}(b) - \frac{\mathfrak{target}(b)}{2^k} ; \mathfrak{target}(b) \rrbracket \right\}. \] Note that any valid block has a level of at least \(k=0\), since we have \( \mathfrak{target}(b) - \frac{\mathfrak{target}(b)}{2^0} = 0 \leq\) \diff{b} \(\leq\) \target{b}. \subsection{State compression} Our state compression protocol follows the same guidelines as Kiayias~\textit{et al.}~\cite{kiayias2021mining}. \begin{itemize} \item Compress chains using \diff{.}/\target{.} ratio rarity instead of superblocks. \item Rarity is introduced by sorting ratios into exponentially rarer buckets. \end{itemize} \paragraph{Notation.} In this paper, we use the notation introduced by Kiayias~\textit{et al.}~\cite{kiayias2021mining}. We denote by $\mathcal{C}$ an interlinked blockchain, with $\mathcal{C}[i]$ denoting its $i^{th}$ element. $\mathcal{C}[i:j]$ represents blocks from the $i^{th}$ element inclusive to the $j^{th}$ block exclusive. $\mathcal{C}[:j]$ denotes blocks from the start up to the $j^{th}$ block exclusive, and $\mathcal{C}[i:]$ denotes blocks from the $i^{th}$ element inclusive to the end of the chain. Block indices $i$ and $j$ can be replaced by blocks $A$ and $Z$. We then write $\mathcal{C}[A:Z]$ to designate the chain from block $A$ inclusive to $Z$ exclusive. Again, either end can be omitted. A negative index means to take blocks from the end instead of from the start; thus $\mathcal{C}[-1]$ denotes the tip of $\mathcal{C}$.
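This slicing notation deliberately mirrors Python's list slicing; a minimal sketch, with a chain represented as a plain list of (hypothetical) block labels:

```python
# A chain as a plain list of blocks (labels stand in for real blocks).
chain = ["G", "B1", "B2", "B3", "B4", "B5"]

k = 2
print(chain[1:3])    # C[i:j]: i-th inclusive to j-th exclusive -> ['B1', 'B2']
print(chain[:-k])    # C[:-k]: the stable part, all but the last k blocks
print(chain[-k:])    # C[-k:]: the unstable tip of length k
print(chain[-1])     # C[-1]: the tip of the chain -> 'B5'
```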
We write $\mathcal{C}\uparrow^\mu$ to mean the subsequence of $\mathcal{C}$ containing only its $\mu$-superblocks. The $\uparrow$ operator is absolute: $(\mathcal{C}\uparrow^\mu)\uparrow^{\mu+i} = \mathcal{C}\uparrow^{\mu+i}$. Since $\mathcal{C}$ is interlinked, $\mathcal{C}\uparrow^\mu$ is a chain too. We write $A \in \mathcal{C}$ to mean block $A$ is in the chain $\mathcal{C}$. Given two chains $\mathcal{C}_1$ and $\mathcal{C}_2$, we write $\mathcal{C}_1 \subseteq \mathcal{C}_2$ to denote that all of $\mathcal{C}_1$'s blocks are in $\mathcal{C}_2$. We denote by $\mathcal{C}_1 \cup \mathcal{C}_2$ the chain consisting of all blocks in either chain, and by $\mathcal{C}_1 \cap \mathcal{C}_2$ the chain consisting of the blocks in both chains. We write $\mathcal{C}_1 \setminus \mathcal{C}_2$ for the chain consisting of the blocks in $\mathcal{C}_1$ but not in $\mathcal{C}_2$. We extend Kiayias~\textit{et al.}'s notation with an operator for filtering the chain according to the ratio. We write $\mathcal{C}\Uparrow^n$ to mean the subsequence of $\mathcal{C}$ containing only the blocks whose ratio falls in the interval $[1 - \frac{1}{2^n} ; 1 - \frac{1}{2^{n+1}}]$. Like the superblock operator, the $\Uparrow$ operator is absolute: $(\mathcal{C}\Uparrow^n)\Uparrow^{n+i} = \mathcal{C}\Uparrow^{n+i}$. Since $\mathcal{C}$ is interlinked, $\mathcal{C}\Uparrow^n$ is a chain as well. Since the union, intersection and subtraction of chains do not always result in a chain, the resulting blocks must be ordered chronologically and their interlink pointers checked. The chain filtering operators $[\cdot]$, $\{\cdot\}$, $\uparrow$ and $\Uparrow$ have precedence over $\cup$, $\cap$ and $\setminus$. \paragraph{Compression algorithm.} Our compression algorithm \textsc{Compress}$_{m,k}(\mathcal{C})$ is given in Algorithm~\ref{alg:chaincompression}. We modify the algorithm given by Kiayias~\textit{et al.}~\cite{kiayias2021mining}, using our ratio metric instead of superblocks to represent block rarity.
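Before turning to the algorithm itself, the level function and the ratio-based filtering operator $\Uparrow^n$ can be sketched in Python (a hypothetical illustration: blocks are reduced to `(diff, target)` pairs, and the representation is ours, not part of the protocol):

```python
from typing import List, Tuple

Block = Tuple[int, int]   # hypothetical block representation: (diff, target)

def level(diff: int, target: int, ell: int = 32) -> int:
    """level(b): largest k in [0, ell) with
    diff(b) in [target(b) - target(b)/2^k ; target(b)]."""
    best = 0
    for k in range(ell):
        if target - target // (2 ** k) <= diff <= target:
            best = k
        else:
            break   # the intervals are nested and shrink as k grows
    return best

def ratio_filter(chain: List[Block], n: int) -> List[Block]:
    """C up-up n: keep blocks whose diff/target ratio lies in
    [1 - 1/2^n ; 1 - 1/2^(n+1)]."""
    lo, hi = 1 - 1 / 2 ** n, 1 - 1 / 2 ** (n + 1)
    return [b for b in chain if lo <= b[0] / b[1] <= hi]

print(level(600, 1000))   # -> 1, since 600 lies in [500 ; 1000] but not [750 ; 1000]
```

Note how the two views agree: a block with ratio $0.6$ has level $1$ and is kept exactly by the $n = 1$ bucket.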
Given a chain $\mathcal{C}$ that we want to compress, we set aside the $k$ most recent, unstable blocks in $\chi$. The remaining prefix $\mathcal{C}[:-k]$ constitutes the stable part of the chain. \begin{algorithm} \caption{\label{alg:chaincompression}Chain compression algorithm.} \begin{algorithmic} \Function{Dissolve$_{m,k}$}{$\mathcal{C}$} \State $\mathcal{C}^* \gets \mathcal{C}[:-k]$ \State $\mathcal{D} \gets \emptyset$ \If{$|\mathcal{C}^*| \geq 2m$} \State $\ell \gets \max\{n : |\mathcal{C}^* \Uparrow^n| \geq 2m\}$ \State $\mathcal{D}[\ell] \gets \mathcal{C}^*\Uparrow^\ell$ \For{$n \gets \ell-1$ down to $0$} \State $b \gets \mathcal{C}^*\Uparrow^{n+1} [-m]$ \State $\mathcal{D}[n] \gets \mathcal{C}^*\Uparrow^n [-2m:] \cup \mathcal{C}^*\Uparrow^n \{b:\}$ \EndFor \Else \State $\ell \gets 0$ \State $\mathcal{D}[0] \gets \mathcal{C}^*$ \EndIf \State $\chi \gets \mathcal{C}[-k:]$ \State \Return $(\mathcal{D},\ell,\chi)$ \EndFunction \Function{Compress$_{m,k}$}{$\mathcal{C}$} \State $(\mathcal{D}, \ell, \chi) \gets $ \textsc{Dissolve}$_{m,k}(\mathcal{C})$ \State $\pi \gets \bigcup_{n=0}^{\ell} \mathcal{D}[n]$ \State \Return $\pi\chi$ \EndFunction \end{algorithmic} \end{algorithm} \begin{algorithm} \caption{\label{alg:statecomparison}State comparison algorithm.} \begin{algorithmic} \Function{maxvalid$_{m,k}$}{$\Pi, \Pi'$} \If{$\Pi$ is not valid} \State \Return $\Pi'$ \EndIf \If{$\Pi'$ is not valid} \State \Return $\Pi$ \EndIf \State $(\mathcal{D}, \ell, \chi) \gets $ \textsc{Dissolve}$_{m,k}(\Pi)$ \State $(\mathcal{D}', \ell', \chi') \gets $ \textsc{Dissolve}$_{m,k}(\Pi')$ \State $M \gets \{ n \in \mathbb{N} : \mathcal{D}[n] \cap \mathcal{D'}[n] \neq \emptyset \}$ \If{$M = \emptyset$} \If{$\ell' > \ell$} \State \Return $\Pi'$ \EndIf \State \Return $\Pi$ \EndIf \State $n \gets \min M$ \State $b \gets (\mathcal{D}[n] \cap \mathcal{D}'[n])[-1]$ \If{$diff(\mathcal{D}'[n]\{b:\}) > diff(\mathcal{D}[n]\{b:\})$} \State \Return $\Pi'$ \EndIf \State \Return $\Pi$ \EndFunction \end{algorithmic}
\end{algorithm} \paragraph{Synchronization.} We consider a node booting for the first time, holding only the genesis block. This node is parameterized by the security parameter $m$. We call this node a verifier. The first step of the verifier is to gauge the current difficulty of the network. Indeed, the verifier needs to determine a value for $k$ to know the length of the unstable parts of the NIPoPoWs it will receive. The verifier cannot simply receive this value from the provers, as it has no way to determine the correct value for $k$ among the ones it receives. Instead, it estimates the current difficulty of the network itself, using the idea of a weather balloon~\cite{zindros2020decentralized}. Once the verifier has determined a value for $k$, it connects to multiple full nodes, which we call provers. We assume at least one of the provers is honest. \begin{itemize} \item Verifier bootstraps with only the genesis block $G$. \item Weather balloon to determine a value for $k$. \item Apply the MLS comparison algorithm: the heaviest chain from the last common ancestor wins. \end{itemize} \begin{theorem}[Security] Whenever the verifier receives a proof $\Pi$ constructed by an honest party and a proof $\Pi'$ constructed by the adversary, it will decide in favour of the honest proof, unless the adversary is playing honestly and $\Pi'$ was generated according to the protocol. \end{theorem} \begin{proof}[Sketch] Consider the case $M \neq \emptyset$. If the comparison is performed at level $n = 0$, full proofs are compared and the heaviest chain wins. The theorem holds due to the Common Prefix property. If the comparison is performed at level $n > 0$, we apply the Common Prefix property at level $n$. By construction, there are at least $m$ blocks of level $n$ in the honest chain. As the Common Prefix property holds for blocks filtered by our ratio operator, the honest parties win.
Consider now the case $M = \emptyset$. We can again apply the Common Prefix property for blocks at the highest level $\ell$. By construction, there are at least $2m$ blocks in the honest chain at this level. Since the adversary must achieve a level $\ell' > \ell$ to win, the adversary requires at least $2m$ blocks of level $\ell'$, which would contradict the Common Prefix property. \end{proof} \begin{theorem}[Online] Consider $\Pi = \textsc{Compress}_{m,k}(\mathcal{C})$ generated from an underlying honest chain $\mathcal{C}$, and a block $b$ mined on top of $\mathcal{C}$. Then $\textsc{Compress}_{m,k}(\mathcal{C}b) = \textsc{Compress}_{m,k}(\Pi b)$. \end{theorem} \subsection{Weather balloon} Use the interactive version for now. \begin{itemize} \item Use a weather balloon to count the number of mined blocks. \item The number of blocks gives us the estimated population/mining power. \item This estimation gives us a value for $k$. \end{itemize} % Theorem 2 Garay variable difficulty For the common prefix to hold, we need $k \geq \frac{\theta \gamma m}{4 \tau}$. As $m = 2016$ and $\tau = 4$, we want to compute $\theta$ and $\gamma$ in order to get $k$: \begin{itemize} % Definition 5 Garay variable difficulty \item $\theta$: we have $f(T_r^{max}(E),n_r) \leq \theta f$ (with $f = 0.03$), so $\theta \geq \frac{f(T_r^{max}(E),n_r)}{f}$. We can estimate $q$ as $\frac{diff(S)}{rounds(S)}$, with $S$ the set of blocks confirming the weather balloon, $diff(S)$ the expected number of nonces tested to mine the blocks of $S$, and $rounds(S)$ the number of rounds occurring during the mining of those blocks. Then we have $\theta \geq \frac{1 - (1 - \frac{T}{2^\kappa})^{q n_r}}{f}$. % Unlike above, where we take a value covering the worst case, this does not seem to be the case below. \theta and \eta have the same value... So it seems we have the same problem above. Is having \theta = \eta a problem in itself?
% Page 2 Garay variable difficulty %\item $\gamma$: by definition of $(\gamma, s)$-respecting, we have $\gamma \geq \frac{\max_{r\in S}n_r}{\min_{r \in S}n_r}$ with $n_r = \frac{diff(S)f 2^\kappa}{n_0 m T_0}$ % ñ definition page 172 Zindros thesis \end{itemize} \section{Analysis}\label{sec:analysis} \section{Conclusion}\label{sec:conclusion} \appendix \section{Appendix} Paper formatted using \url{https://github.com/acmccs/format}. \begin{acks} % TODO: For the submission, don't include acknowledgments since they would most likely deanonymize you. \end{acks}