Lightning and its fake HTLCs

Lightning is terrible but can be very good with two tweaks.

How Lightning would work without HTLCs

In a world in which HTLCs didn’t exist, Lightning channels would consist only of balances. Each commitment transaction would have two outputs: one for peer A, the other for peer B, according to the current state of the channel.

When a payment was being attempted to go through the channel, peers would just trust each other to update the state when necessary. For example:

  1. Channel AB’s balances are A[10:10]B (in sats);
  2. A sends a 3sat payment through B to C;
  3. A asks B to route the payment. Channel AB doesn’t change at all;
  4. B sends the payment to C, C accepts it;
  5. Channel BC changes from B[20:5]C to B[17:8]C;
  6. B notifies A the payment was successful, A acknowledges that;
  7. Channel AB changes from A[10:10]B to A[7:13]B.

This in the case of a success, everything is fine, no glitches, no dishonesty.

But notice that A could have refused to acknowledge that the payment went through, either because of a bug, or because it went offline forever, or because it is malicious. Then the channel AB would stay as A[10:10]B and B would have lost 3 satoshis.

How Lightning would work with HTLCs

HTLCs are introduced to remedy that situation. Now instead of commitment transactions having always only two outputs, one to each peer, now they can have HTLC outputs too. These HTLC outputs could go to either side dependending on the circumstance.

Specifically, the peer that is sending the payment can redeem the HTLC after a number of blocks have passed. The peer that is receiving the payment can redeem the HTLC if they are able to provide the preimage to the hash specified in the HTLC.

Now the flow is something like this:

  1. Channel AB’s balances are A[10:10]B;
  2. A sends a 3sat payment through B to C:
  3. A asks B to route the payment. Their channel changes to A[7:3:10]B (the middle number is the HTLC).
  4. B offers a payment to C. Their channel changes from B[20:5]C to B[17:3:5]C.
  5. C tells B the preimage for that HTLC. Their channel changes from B[17:3:5]C to B[17:8]C.
  6. B tells A the preimage for that HTLC. Their channel changes from A[7:3:10]B to A[7:13]B.

Now if A wants to trick B and stop responding B doesn’t lose money, because B knows the preimage, B just needs to publish the commitment transaction A[7:3:10]B, which gives him 10sat and then redeem the HTLC using the preimage he got from C, which gives him 3 sats more. B is fine now.

In the same way, if B stops responding for any reason, A won’t lose the money it put in that HTLC, it can publish the commitment transaction, get 7 back, then redeem the HTLC after the certain number of blocks have passed and get the other 3 sats back.

How Lightning doesn’t really work

The example above about how the HTLCs work is very elegant but has a fatal flaw on it: transaction fees. Each new HTLC added increases the size of the commitment transaction and it requires yet another transaction to be redeemed. If we consider fees of 10000 satoshis that means any HTLC below that is as if it didn’t existed because we can’t ever redeem it anyway. In fact the Lightning protocol explicitly dictates that if HTLC output amounts are below the fee necessary to redeem them they shouldn’t be created.

What happens in these cases then? Nothing, the amounts that should be in HTLCs are moved to the commitment transaction miner fee instead.

So considering a transaction fee of 10000sat for these HTLCs if one is sending Lightning payments below 10000sat that means they operate according to the unsafe protocol described in the first section above.

It is actually worse, because consider what happens in the case a channel in the middle of a route has a glitch or one of the peers is unresponsive. The other node, thinking they are operating in the trustless protocol, will proceed to publish the commitment transaction, i.e. close the channel, so they can redeem the HTLC – only then they find out they are actually in the unsafe protocol realm and there is no HTLC to be redeemed at all and they lose not only the money, but also the channel (which costed a lot of money to open and close, in overall transaction fees).

One of the biggest features of the trustless protocol are the payment proofs. Every payment is identified by a hash and whenever the payee releases the preimage relative to that hash that means the payment was complete. The incentives are in place so all nodes in the path pass the preimage back until it reaches the payer, which can then use it as the proof he has sent the payment and the payee has received it. This feature is also lost in the unsafe protocol: if a glitch happens or someone goes offline on the preimage’s way back then there is no way the preimage will reach the payer because no HTLCs are published and redeemed on the chain. The payee may have received the money but the payer will not know – but the payee will lose the money sent anyway.

The end of HTLCs

So considering the points above you may be sad because in some cases Lightning doesn’t use these magic HTLCs that give meaning to it all. But the fact is that no matter what anyone thinks, HTLCs are destined to be used less and less as time passes.

The fact that over time Bitcoin transaction fees tend to rise, and also the fact that multipart payment (MPP) are increasedly being used on Lightning for good, we can expect that soon no HTLC will ever big enough to be actually worth redeeming and we will be at a point in which not a single HTLC is real and they’re all fake.

Another thing to note is that the current unsafe protocol kicks out whenever the HTLC amount is below the Bitcoin transaction fee would be to redeem it, but this is not a reasonable algorithm. It is not reasonable to lose a channel and then pay 10000sat in fees to redeem a 10001sat HTLC. At which point does it become reasonable to do it? Probably in an amount many times above that, so it would be reasonable to even increase the threshold above which real HTLCs are made – thus making their existence more and more rare.

These are good things, because we don’t actually need HTLCs to make a functional Lightning Network.

We must embrace the unsafe protocol and make it better

So the unsafe protocol is not necessarily very bad, but the way it is being done now is, because it suffers from two big problems:

  1. Channels are lost all the time for no reason;
  2. No guarantees of the proof-of-payment ever reaching the payer exist.

The first problem we fix by just stopping the current practice of closing channels when there are no real HTLCs in them.

That, however, creates a new problem – or actually it exarcebates the second: now that we’re not closing channels, what do we do with the expired payments in them? These payments should have either been canceled or fulfilled before some block x, now we’re in block x+1, our peer has returned from its offline period and one of us will have to lose the money from that payment.

That’s fine because it’s only 3sat and it’s better to just lose 3sat than to lose both the 3sat and the channel anyway, so either one would be happy to eat the loss. Maybe we’ll even split it 50/50! No, that doesn’t work, because it creates an attack vector with peers becoming unresponsive on purpose on one side of the route and actually failing/fulfilling the payment on the other side and making a profit with that.

So we actually need to know who is to blame on these payments, even if we are not going to act on that imediatelly: we need some kind of arbiter that both peers can trust, such that if one peer is trying to send the preimage or the cancellation to the other and the other is unresponsive, when the unresponsive peer comes back, the arbiter can tell them they are to blame, so they can willfully eat the loss and the channel can continue. Both peers are happy this way.

If the unresponsive peer doesn’t accept what the arbiter says then the peer that was operating correctly can assume the unresponsive peer is malicious and close the channel, and then blacklist it and never again open a channel with a peer they know is malicious.

Again, the differences between this scheme and the current Lightning Network are that:

  1. In the current Lightning we always close channels, in this scheme we only close channels in case someone is malicious or in other worst case scenarios (the arbiter is unresponsive, for example).
  2. In the current Lightning we close the channels without having any clue on who is to blame for that, then we just proceed to reopen a channel with that same peer even in the case they were actively trying to harm us before.

What is missing? An arbiter.

The Bitcoin blockchain is the ideal arbiter, it works in the best possible way if we follow the trustless protocol, but as we’ve seen we can’t use the Bitcoin blockchain because it is expensive.

Therefore we need a new arbiter. That is the hard part, but not unsolvable. Notice that we don’t need an absolutely perfect arbiter, anything is better than nothing, really, even an unreliable arbiter that is offline half of the day is better than what we have today, or an arbiter that lies, an arbiter that charges some satoshis for each resolution, anything.

Here are some suggestions:

  • random nodes from the network selected by an algorithm that both peers agree to, so they can’t cheat by selecting themselves. The only thing these nodes have to do is to store data from one peer, try to retransmit it to the other peer and record the results for some time.
  • a set of nodes preselected by the two peers when the channel is being opened – same as above, but with more handpicked-trust involved.
  • some third-party cloud storage or notification provider with guarantees of having open data in it and some public log-keeping, like Twitter, GitHub or a Nostr relay;
  • peers that get paid to do the job, selected by the fact that they own some token (I know this is stepping too close to the shitcoin territory, but could be an idea) issued in a Spacechain;
  • a Spacechain itself, serving only as the storage for a bunch of OP_RETURNs that are published and tracked by these Lightning peers whenever there is an issue (this looks wrong, but could work).

Key points

  1. Lightning with HTLC-based routing was a cool idea, but it wasn’t ever really feasible.
  2. HTLCs are going to be abandoned and that’s the natural course of things.
  3. It is actually good that HTLCs are being abandoned, but
  4. We must change the protocol to account for the existence of fake HTLCs and thus make the bulk of the Lightning Network usage viable again.

See also