Technical: A Brief History of Payment Channels: from Satoshi to Lightning Network
If you find WORDS helpful, Bitcoin donations are unnecessary but appreciated. Our goal is to spread and preserve Bitcoin writings for future generations. Read more. | Make a Donation |
Technical: A Brief History of Payment Channels: from Satoshi to Lightning Network
By Alan Manuel K. Gloria
July 12, 2019
Who cares about political tweets from some random countryâs president when payment channels are a much more interesting and are actually capable of carrying value?
So letâs have a short history of various payment channel techs!
Generation 0: Satoshiâs Broken nSequence Channels
Because Satoshiâs Vision included payment channels, except his implementation sucked so hard we had to go fix it and added RBF as a by-product.
Originally, the plan for nSequence
was that mempools would replace any transaction spending certain inputs with another transaction spending the same inputs, but only if the nSequence
field of the replacement was larger.
Since 0xFFFFFFFF
was the highest value that nSequence
could get, this would mark a transaction as âfinalâ and not replaceable on the mempool anymore.
In fact, this â nSequence
channelâ I will describe is the reason why we have this weird rule about nLockTime
and nSequence
. nLockTime
actually only works if nSequence
is not 0xFFFFFFFF
i.e. final. If nSequence
is 0xFFFFFFFF
then nLockTime
is ignored, because this if the âfinalâ version of the transaction.
So what youâd do would be something like this:
- You go to a bar and promise the bartender to pay by the time the bar closes. Because this is the Bitcoin universe, time is measured in blockheight, so the closing time of the bar is indicated as some future blockheight.
- For your first drink, youâd make a transaction paying to the bartender for that drink, paying from some coins you have. The transaction has an
nLockTime
equal to the closing time of the bar, and a startingnSequence
of 0. You hand over the transaction and the bartender hands you your drink. - For your succeeding drink, youâd remake the same transaction, adding the payment for that drink to the transaction output that goes to the bartender (so that output keeps getting larger, by the amount of payment), and having an
nSequence
that is one higher than the previous one. - Eventually you have to stop drinking. It comes down to one of two possibilities:
- You drink until the bar closes. Since it is now the
nLockTime
indicated in the transaction, the bartender is able to broadcast the latest transaction and tells the bouncers to kick you out of the bar. - You wisely consider the state of your liver. So you re-sign the last transaction with a âfinalâ
nSequence
of0xFFFFFFFF
i.e. the maximum possible value it can have. This allows the bartender to get his or her funds immediately (nLockTime
is ignored ifnSequence
is0xFFFFFFFF
), so he or she tells the bouncers to let you out of the bar.
- You drink until the bar closes. Since it is now the
Now that of course is a payment channel. Individual payments (purchases of alcohol, so I guess buying coffee is not in scope for payment channels). Closing is done by creating a âfinalâ transaction that is the sum of the individual payments. Sure thereâs no routing and channels are unidirectional and channels have a maximum lifetime but give Satoshi a break, he was also busy inventing Bitcoin at the time.
Now if you noticed I called this kind of payment channel âbrokenâ. This is because the mempool rules are not consensus rules, and cannot be validated (nothing about the mempool can be validated onchain: I sigh every time somebody proposes âletâs make block size dependent on mempool sizeâ, mempool state cannot be validated by onchain data). Fullnodes canât see all of the transactions you signed, and then validate that the final one with the maximum nSequence
is the one that actually is used onchain. So you can do the below:
- Become friends with Jihan Wu, because he owns >51% of the mining hashrate (he totally reorged Bitcoin to reverse the Binance hack right?).
- Slip Jihan Wu some of the more interesting drinks youâre ordering as an incentive to cooperate with you. So say you end up ordering 100 drinks, you split it with Jihan Wu and give him 50 of the drinks.
- When the bar closes, Jihan Wu quickly calls his mining rig and tells them to mine the version of your transaction with
nSequence
0. You know, that first one where you pay for only one drink. - Because fullnodes cannot validate
nSequence
, theyâll accept even thenSequence
=0 version and confirm it, immutably adding you paying for a single alcoholic drink to the blockchain. - The bartender, pissed at being cheated, takes out a shotgun from under the bar and shoots at you and Jihan Wu.
- Jihan Wu uses his mystical chi powers (actually the combined exhaust from all of his mining rigs) to slow down the shotgun pellets, making them hit you as softly as petals drifting in the wind.
- The bartender mutters some words, clothes ripping apart as he or she (hard to believe it could be a she but hey) turns into a bear, ready to maul you for cheating him or her of the payment for all the 100 drinks you ordered from him or her.
- Steely-eyed, you stand in front of the bartender-turned-bear, daring him to touch you. Youâve watched Revenant, you know Leonardo di Caprio could survive a bear mauling, and if some posh actor can survive that, you know you can too. You make a pose. âDrunken troll logic attack!â
- I think I got sidetracked here.
Lessons learned?
- Bears are bad news.
- You canât reasonably invoke âSatoshiâs Visionâ and simultaneously reject the Lightning Network because itâs not onchain. Satoshiâs Vision included a half-assed implementation of payment channels with
nSequence
, where the onchain transaction represented multiple logical payments, exactly what modern offchain techniques do (except modern offchain techniques actually work).nSequence
(the field, but not its modern meaning) has been in Bitcoin since BitCoin For Windows Alpha 0.1.0. And its original intent was payment channels. You canât get nearer to Satoshiâs Vision than being a field that Satoshi personally added to transactions on the very first public release of the BitCoin software, like srsly. - Miners can totally bypass mempool rules. In fact, the reason why
nSequence
has been repurposed to indicate âoptionalâ replace-by-fee is because miners are already incentivized by thenSequence
system to always follow replace-by-fee anyway. I mean, what do you think those drinks you passed to Jihan Wu are, other than the fee you pay him to mine a specific version of your transaction? - Satoshi made mistakes. The original design for
nSequence
is one of them. Today, we no longer usenSequence
in this way. So diverging from Satoshiâs original design is part and parcel of Bitcoin development, because over time, we learn new lessons that Satoshi never knew about. Satoshi was an important landmark in this technology. He will not be the last, or most important, that we will remember in the future: he will only be the first.
Spilman Channels
Incentive-compatible time-limited unidirectional channel; or, Satoshiâs Vision, Fixed (if transaction malleability hadnât been a problem, that is).
Now, we know the bartender will turn into a bear and maul you if you try to cheat the payment channel, and now that weâve revealed youâre good friends with Jihan Wu, the bartender will no longer accept a payment channel scheme that lets one you cooperate with a miner to cheat the bartender.
Fortunately, Jeremy Spilman proposed a better way that would not let you cheat the bartender.
First, you and the bartender perform this ritual:
- You get some funds and create a transaction that pays to a 2-of-2 multisig between you and the bartender. You donât broadcast this yet: you just sign it and get its txid.
- You create another transaction that spends the above transaction. This transaction (the âbackoffâ) has an
nLockTime
equal to the closing time of the bar, plus one block. You sign it and give this backoff transaction (but not the above transaction) to the bartender. - The bartender signs the backoff and gives it back to you. It is now valid since itâs spending a 2-of-2 of you and the bartender, and both of you have signed the backoff transaction.
- Now you broadcast the first transaction onchain. You and the bartender wait for it to be deeply confirmed, then you can start ordering.
The above is probably vaguely familiar to LN users. Itâs the funding process of payment channels! The first transaction, the one that pays to a 2-of-2 multisig, is the funding transaction that backs the payment channel funds.
So now you start ordering in this way:
- For your first drink, you create a transaction spending the funding transaction output and sending the price of the drink to the bartender, with the rest returning to you.
- You sign the transaction and pass it to the bartender, who serves your first drink.
- For your succeeding drinks, you recreate the same transaction, adding the price of the new drink to the sum that goes to the bartender and reducing the money returned to you. You sign the transaction and give it to the bartender, who serves you your next drink.
- At the end:
- If the bar closing time is reached, the bartender signs the latest transaction, completing the needed 2-of-2 signatures and broadcasting this to the Bitcoin network. Since the backoff transaction is the closing time + 1, it canât get used at closing time.
- If you decide you want to leave early because your liver is crying, you just tell the bartender to go ahead and close the channel (which the bartender can do at any time by just signing and broadcasting the latest transaction: the bartender wonât do that because he or she is hoping youâll stay and drink more).
- If you ended up just hanging around the bar and never ordering, then at closing time + 1 you broadcast the backoff transaction and get your funds back in full.
Now, even if you pass 50 drinks to Jihan Wu, you canât give him the first transaction (the one which pays for only one drink) and ask him to mine it: itâs spending a 2-of-2 and the copy you have only contains your own signature. You need the bartenderâs signature to make it valid, but he or she sure as hell isnât going to cooperate in something that would lose him or her money, so a signature from the bartender validating old state where he or she gets paid less isnât going to happen.
So, problem solved, right? Right? Okay, letâs try it. So you get your funds, put them in a funding tx, get the backoff tx, confirm the funding txâŠ
Once the funding transaction confirms deeply, the bartender laughs uproariously. He or she summons the bouncers, who surround you menacingly.
âIâm refusing service to you,â the bartender says.
âFine,â you say. âI was leaving anyway;â You smirk. âIâll get back my money with the backoff transaction, and posting about your poor service on reddit so you get negative karma, so there!â
âNot so fast,â the bartender says. His or her voice chills your bones. It looks like your exploitation of the Satoshi nSequence
payment channel is still fresh in his or her mind. âLook at the txid of the funding transaction that got confirmed.â
âWhat about it?â you ask nonchalantly, as you flip open your desktop computer and open a reputable blockchain explorer.
What you see shocks you.
âWhat the â the txid is different! Youâ you changed my signature?? But how? I put the only copy of my private key in a sealed envelope in a cast-iron box inside a safe buried in the Gobi desert protected by a clan of nomads who have dedicated their lives and their childrensâ lives to keeping my private key safe in perpetuity!â
âDidnât you know?â the bartender asks. âThe components of the signature are just very large numbers. The sign of one of the signature components can be changed, from positive to negative, or negative to positive, and the signature will remain valid. Anyone can do that, even if they donât know the private key. But because Bitcoin includes the signatures in the transaction when itâs generating the txid, this little change also changes the txid.â He or she chuckles. âThey say theyâll fix it by sep_arating the _sig natures from the transaction body. Theyâre saying that these kinds of signature malleability wonât affect transaction ids anymore after they do this, but I bet I can get my good friend Jihan Wu to delay this âSepSigâ plan for a good while yet. Friendly guy, this Jihan Wu, it turns out all I had to do was slip him 51 drinks and he was willing to mine a tx with the signature signs flipped.â His or her grin widens. âIâm afraid your backoff transaction wonât work anymore, since it spends a txid that is not existent and will never be confirmed. So hereâs the deal. You pay me 99% of the funds in the funding transaction, in exchange for me signing the transaction that spends with the txid that you see onchain. Refuse, and you lose 100% of the funds and every other HODLer, including me, benefits from the reduction in coin supply. Accept, and you get to keep 1%. I lose nothing if you refuse, so I wonât care if you do, but consider the difference of getting zilch vs. getting 1% of your funds.â His or her eyes glow. âGENUFLECT RIGHT NOW.â
Lesson learned?
- Paybackâs a bitch.
- Transaction malleability is a bitchier bitch. Itâs why we needed to fix the bug in SegWit. Sure, MtGox claimed they were attacked this way because someone kept messing with their transaction signatures and thus they lost track of where their funds went, but really, the bigger impetus for fixing transaction malleability was to support payment channels.
- Yes, including the signatures in the hash that ultimately defines the txid was a mistake. Satoshi made a lot of those. So weâre just reiterating the lesson âSatoshi was not an infinite being of infinite wisdomâ here. Satoshi just gets a pass because of how awesome Bitcoin is.
CLTV-protected Spilman Channels
Using CLTV for the backoff branch.
This variation is simply Spilman channels, but with the backoff transaction replaced with a backoff branch in the SCRIPT you pay to. It only became possible after OP_CHECKLOCKTIMEVERIFY
(CLTV) was enabled in 2015.
Now as we saw in the Spilman Channels discussion, transaction malleability means that any pre-signed offchain transaction can easily be invalidated by flipping the sign of the signature of the funding transaction while the funding transaction is not yet confirmed.
This can be avoided by simply putting any special requirements into an explicit branch of the Bitcoin SCRIPT. Now, the backoff branch is supposed to create a maximum lifetime for the payment channel, and prior to the introduction of OP_CHECKLOCKTIMEVERIFY
this could only be done by having a pre-signed nLockTime
transaction.
With CLTV, however, we can now make the branches explicit in the SCRIPT that the funding transaction pays to.
Instead of paying to a 2-of-2 in order to set up the funding transaction, you pay to a SCRIPT which is basically â2-of-2, OR this singlesig after a specified lock timeâ.
With this, there is no backoff transaction that is pre-signed and which refers to a specific txid. Instead, you can create the backoff transaction later, using whatever txid the funding transaction ends up being confirmed under. Since the funding transaction is immutable once confirmed, it is no longer possible to change the txid afterwards.
Todd Micropayment Networks
The old hub-spoke model (that isnât how LN today actually works).
One of the more direct predecessors of the Lightning Network was the hub-spoke model discussed by Peter Todd. In this model, instead of payers directly having channels to payees, payers and payees connect to a central hub server. This allows any payer to pay any payee, using the same channel for every payee on the hub. Similarly, this allows any payee to receive from any payer, using the same channel.
Remember from the above Spilman example? When you open a channel to the bartender, you have to wait around for the funding tx to confirm. This will take an hour at best. Now consider that you have to make channels for everyone you want to pay to. Thatâs not very scalable.
So the Todd hub-spoke model has a central âclearing houseâ that transport money from payers to payees. The âMoonbeamâ project takes this model. Of course, this reveals to the hub who the payer and payee are, and thus the hub can potentially censor transactions. Generally, though, it was considered that a hub would more efficiently censor by just not maintaining a channel with the payer or payee that it wants to censor (since the money it owned in the channel would just be locked uselessly if the hub wonât process payments to/from the censored user).
In any case, the ability of the central hub to monitor payments means that it can surveill the payer and payee, and then sell this private transactional data to third parties. This loss of privacy would be intolerable today.
Peter Todd also proposed that there might be multiple hubs that could transport funds to each other on behalf of their users, providing somewhat better privacy.
Another point of note is that at the time such networks were proposed, only unidirectional (Spilman) channels were available. Thus, while one could be a payer, or payee, you would have to use separate channels for your income versus for your spending. Worse, if you wanted to transfer money from your income channel to your spending channel, you had to close both and reshuffle the money between them, both onchain activities.
Poon-Dryja Lightning Network
Bidirectional two-participant channels.
The Poon-Dryja channel mechanism has two important properties:
- Bidirectional.
- No time limit.
Both the original Satoshi and the two Spilman variants are unidirectional: there is a payer and a payee, and if the payee wants to do a refund, or wants to pay for a different service or product the payer is providing, then they canât use the same unidirectional channel.
The Poon-Dryjam mechanism allows channels, however, to be bidirectional instead: you are not a payer or a payee on the channel, you can receive or send at any time as long as both you and the channel counterparty are online.
Further, unlike either of the Spilman variants, there is no time limit for the lifetime of a channel. Instead, you can keep the channel open for as long as you want.
Both properties, together, form a very powerful scaling property that I believe most people have not appreciated. With unidirectional channels, as mentioned before, if you both earn and spend over the same network of payment channels, you would have separate channels for earning and spending. You would then need to perform onchain operations to âreverseâ the directions of your channels periodically. Secondly, since Spilman channels have a fixed lifetime, even if you never used either channel, you would have to periodically ârefreshâ it by closing it and reopening.
With bidirectional, indefinite-lifetime channels, you may instead open some channels when you first begin managing your own money, then close them only after your lawyers have executed your last will and testament on how the money in your channels get divided up to your heirs: thatâs just two onchain transactions in your entire lifetime. That is the potentially very powerful scaling property that bidirectional, indefinite-lifetime channels allow.
I wonât discuss the transaction structure needed for Poon-Dryja bidirectional channels â itâs complicated and you can easily get explanations with cute graphics elsewhere.
There is a weakness of Poon-Dryja that people tend to gloss over (because it was fixed very well by /u/RustyReddit):
- You have to store all the revocation keys of a channel. This implies you are storing 1 revocation key for every channel update, so if you perform millions of updates over your entire lifetime, youâd be storing several megabytes of keys, for only a single channel. /u/RustyReddit fixed this by requiring that the revocation keys be generated from a âSeedâ revocation key, and every key is just the application of SHA256 on that key, repeatedly. For example, suppose I tell you that my first revocation key is SHA256(SHA256(seed)). You can store that in O(1) space. Then for the next revocation, I tell you SHA256(seed). From SHA256(key), you yourself can compute SHA256(SHA256(seed)) (i.e. the previous revocation key). So you can remember just the most recent revocation key, and from there youâd be able to compute every previous revocation key. When you start a channel, you perform SHA256 on your seed for several million times, then use the result as the first revocation key, removing one layer of SHA256 for every revocation key you need to generate. /u/RustyReddit not only came up with this, but also suggested an efficient O(log n) storage structure, the shachain, so that you can quickly look up any revocation key in the past in case of a breach. People no longer really talk about this O(n) revocation storage problem anymore because it was solved very very well by this mechanism.
Another thing I want to emphasize is that while the Lightning Network paper and many of the earlier presentations developed from the old Peter Todd hub-and-spoke model, the modern Lightning Network takes the logical conclusion of removing a strict separation between âhubsâ and âspokesâ. Any node on the Lightning Network can very well work as a hub for any other node. Thus, while you might operate as âmostly a payerâ, âmostly a forwarding nodeâ, âmostly a payeeâ, you still end up being at least partially a forwarding node (âhubâ) on the network, at least part of the time. This greatly reduces the problems of privacy inherent in having only a few hub nodes: forwarding nodes cannot get significantly useful data from the payments passing through them, because the distance between the payer and the payee can be so large that it would be likely that the ultimate payer and the ultimate payee could be anyone on the Lightning Network.
Lessons learned?
- We can decentralize if we try hard enough!
- âHubs badâ can be made âhubs goodâ if everybody is a hub.
- Smart people can solve problems. Itâs kinda why theyâre smart.
Future
After LN, thereâs also the Decker-Wattenhofer Duplex Micropayment Channels (DMC). This post is long enough as-is, LOL. But for now, it uses a novel âdecrementing nSequence
channelâ, using the new relative-timelock semantics of nSequence
(not the broken one originally by Satoshi). It actually uses multiple such âdecrementing nSequence
â constructs, terminating in a pair of Spilman channels, one in both directions (thus âduplexâ). Maybe Iâll discuss it some other time.
The realization that channel constructions could actually hold more channel constructions inside them (the way the Decker-Wattenhofer puts a pair of Spilman channels inside a series of âdecrementing nSequence
channelsâ) lead to the further thought behind Burchert-Decker-Wattenhofer channel factories. Basically, you could host multiple two-participant channel constructs inside a larger multiparticipant âchannelâ construct (i.e. host multiple channels inside a factory).
Further, we have the Decker-Russell-Osuntokun or âeltooâ construction. Iâd argue that this is â nSequence
done rightâ. Iâll write more about this later, because this post is long enough.
Lessons learned?
- Bitcoin offchain scaling is more powerful than you ever thought.