Loading...

Synthesizing contracts from tests

Overview.

In this post we show that smart contracts can be automatically synthesized from just a few tests. Our running example will be the contract for the distributed Rock Paper Scissors (RPS) game that you may have seen in our previous two posts on Design and Verification of smart contracts. There it served as a model of complexities present in real-world contracts.

Synthesizing code from tests is practically important because the typical prerequisite to synthesis is a precise specification of the desired contract behavior. Such specifications can be laborious to write, in part because they are formal, stating the behavior with pre- and postconditions. In contrast, a unit test promises to be a specification format to ordinary programmers.

Specifications are challenging also because they need to be complete in that they state all possible behaviors. We ask the programmer to provide only the tests of the typical behavior and then we automatically derive additional tests. The automatically derived tests describe possible attacks and possible strategies by cheating players. We leave the description of how to automatically generate these tests to a future blog post. In this post we show, using four examples, the power of tests to make the synthesizer produce resilient contract implementations.

We also show how the synthesized contract invokes a library, allowing the synthesized contracts to build on other modules, rather than reinventing the functionality provided by the library. The synthesizer will produce code that works within the restrictions of a given platform or a library. For example, in Move, one such restriction is that the sender of a transaction cannot withdraw funds from an arbitrary account.

This post synthesizes contracts in a hypothetical high-level language from which the final contract can be obtained with a straightforward code generation to languages such as Move or Solidity. To ensure that such code generation is possible, we add restrictions to the high-level language on what programs are allowed to do. Just like when obeying the restrictions of libraries, the synthesizer will produce a contract that permits the final code generation.

Scenarios.

Typical unit tests exercise the program on specific concrete input values. In contrast, tests used in this post use symbolic inputs. Such tests describe the desired behavior of a contract on a set of input values; for that reason we refer to them as scenarios.

Each scenario includes a sequence of calls followed by assertions that must pass at the end of the scenario. Each call is a transaction initiated by an account interacting with the contract. Assertions can refer to pre-values, which are the values at the start of the scenario. An example scenario for the RPS contract is shown here:

                    
Scenario 1 (Alice collects): Alice: rps.player1_bet(Bob, n, handA) Bob: rps.player2_bet(Alice, m, handB) Alice: rps.cash_out() Assertions: if beats(handA, handB) then // Alice beats Bob Alice.balance == pre(Alice.balance) + m Bob.balance == pre(Bob.balance) - m else if beats(handB, handA) then // Bob beats Alice Alice.balance == pre(Alice.balance) - n Bob.balance == pre(Bob.balance) + n else // a tie Alice.balance == pre(Alice.balance) Bob.balance == pre(Bob.balance)

The scenario starts with Alice challenging Bob, sending to the contract, rps, her hand (rock, paper, or scissors) along with n, the size of her bet. Bob responds to rps with his hand and m, the size of his bet. Finally, Alice calls cash_out, which ensures that funds end up with the winner. The assertions ensure that funds are transferred according to the predicate beats, which is supplied by the programmer. (We assume that m need not equal n, which makes synthesis more challenging but allows Bob to bet less than Alice, including possibly making m=0. We ignore this particular opportunity of Bob to strategize in this post.)

Our initial scenarios (provided by the programmer) include another test, called Scenario 2 (not shown), which is identical to Scenario 1, except that it is Bob who calls cash_out, not Alice. The rationale behind not having both players call cash_out is that the funds can be distributed from the contract to players whether the caller is the winner or not.

The protocol.

Besides the desired behavior, the scenario also defines the protocol of the contract. The protocol includes the set of contract methods as well as the order in which the players are to invoke the methods. In Scenarios 1 and 2, Alice invokes player1_bet; next Bob invokes player2_bet; and finally one of Alice and Bob invoke cash_out.

Synthesis 1. Cooperating Players using a Naive Money Transfer Model.

We are now ready to synthesize our first contract. We make two naive assumptions, which we remove later:

  1. Alice and Bob are cooperating players: They will follow the protocol specified in the scenarios. This assumption is encoded by asking the synthesizer to produce code that is correct on Scenarios 1 and 2, without saying anything about outcomes on other scenarios. The platform allows funds transfer between any accounts: In particular, the sender of the transaction can withdraw from an arbitrary account. This freedom is clearly undesirable but let’s see what contract the synthesizer will produce under this permissive money transfer model.

To implement the contract, the synthesizer needs to produce the code for the methods player1_bet, player2_bet, and cash_out. The synthesized methods are shown below.

The assumptions allowed the synthesizer to invent an algorithm that is interesting in two ways:

  • The contract fronts the bet for both Alice and Bob. The contract uses its own funds to deposit Alice’s bet to Bob (in the player1_bet method), and Bob’s bet to Alice (in the player2_bet method). These are the tentative winnings that either player would receive if they won the game. The contract is content to “loan” money to Alice and Bob because one of them is guaranteed to invoke cash_out (recall that they are both cooperating players following the protocol), at which point the contract has the opportunity to withdraw funds from Alice or Bob, makings things even.
  • The cash_out method is ingeniously compact. Getting the logic of cash_out right is somewhat challenging because (1) this method can be invoked by either Alice or Bob; and (2) this sender can be either a winner, or a loser, or there could be a tie. The synthesized code covers these alternatives with just two if statements. Consider the case when the sender is Alice and she lost the game. The contract will withdraw from Alice twice: first, to take back the tentative winnings that the contract deposited to Alice in player2_bet; second, to withdraw Alice’s bet that the contract already deposited to Bob in player1_bet.
    def player1_bet(sender: Account, self: RPS, other: Account, bet: int, hand: int) { // the contract (‘self’) fronts Alice’s (‘sender’s) bet, // depositing own funds to Bob (‘other’) other.deposit(bet) // store the bet info in the sender’s account sender.r = new GameState(hand,other,bet) } def player2_bet(sender: Account, self: RPS, other: Account, bet: int, hand: int) { other.deposit(bet) sender.r = new GameState(hand,other,bet) } def cash_out(sender: Account, self: RPS) { let other = sender.r.other // The contract withdraws from the losing player twice, // once to take back the money fronted in player1_bet // and player2_bet, // and the second time to pay for the fronted bet that stays // with the opposite player. // // On a tie, it takes back from both just the fronted bets. if beats(sender.r.hand, other.r.hand) other.withdraw(other.r.bet) else sender.withdraw(other.r.bet) if beats(other.r.hand, sender.r.hand) sender.withdraw(sender.r.bet) else other.withdraw(sender.r.bet) }

The synthesized code demonstrates that two scenarios are a sufficient specification to synthesize working code that can cleverly take advantage of provided assumptions on the cooperativeness of the players and the money transfer model.

To make the contract realistic, our next step restricts the naive money transfer model so that the contract cannot withdraw funds from an account that is not the sender, which it can currently do in cash_out. After that, we examine a scenario in which attackers wish to insert themselves into the Alice-Bob game.

Synthesis 2: Cooperating Players with a Realistic Money Transfer Model.

We restrict the money transfer model by allowing withdrawals only from the sender of the current transaction. We implement this restriction by simply asserting that the receiver of the method withdraw is the same as the sender, adding assert(this == sender) to method withdraw. This added assertion is sufficient to synthesize code that adjusts its logic to being able to withdraw only from the sender.

The synthesized code is below. The changes compared to the previously synthesized code are in bold.

def player1_bet(sender: Account, self: RPS, other: Account, bet: int, hand: int) { // the contract collects the bet from the sender, // keeping the funds in its account sender.withdraw(bet) sender.r = new GameState(hand,other_player,bet) } def player2_bet(sender: Account, self: RPS, other: Account, bet: int, hand: int) { sender.withdraw(bet) sender.r = new GameState(hand,other_player,bet) } def cash_out(sender: Account, self: RPS) { let other = sender.r.other if beats(sender.r.hand, other.r.hand) sender.deposit(other.r.bet) else other.deposit(other.r.bet) if beats(other.r.hand, sender.r.hand) other.deposit(sender.r.bet) else sender.deposit(sender.r.bet) }

The money transfer restriction that we imposed reflects more the Ethereum platform than the Move platform, because we are not passing around the LibraCoin, but the later could be modeled similarly.

Step 3. Avoiding an attack scenario.

We now consider a possible third-party attack. In Scenario 3, Mallory challenges Bob after Bob has entered into a game with Alice. In this scenario, Mallory selects the hand that beats Bob’s hand (by reading his hand in the chain) and cashes out the winnings. Mallory is taking advantage of a weakness of the current implementation, which assumes that only players who are in the current game will call cash_out. This assumption is correct for cooperation players but Mallory is an attacker.

Scenario 3 (Attack): Alice: rps.player1_bet(Bob, n, handA) Bob: rps.player2_bet(Alice, m, handB) Mallory: reads handB in the chain and select handM to beat it Mallory: rps.player1_bet(Bob, k, handM) Mallory: rps.cash_out() Alice: rps.cash_out() Postconditions: As in Test 1 and Test 2. Note that we are not asserting anything on Mallory’s account.

Adding this Scenario 3 to the scenarios considered by the synthesizer produces code resilient to the attack. The code, shown below, now checks whether the player cashing out is actually a participant in the game.

To thwart the attack, the synthesizer inserted a check that guards the transfer of funds. This check examines the invariant that holds in the synthesized code among the players who are participating in the game. The invariant has Alice’s other field refer to Bob and vice versa, creating a cycle. While Mallory was able to fool the contract into pointing Malloy’s other to Bob, Bob still points back at Alice, his partner in the game. Checking whether the cycle points back at Mallory (it doesn’t) thus uncovers Mallory.

def cash_out(sender: Account, self: RPS) { let other = sender.r.other let others_other = other.r.other // check if the sender is a participant in this play // by checking whether the other player points back // at the sender if sender == others_other // deposit logic the same as in Step 2 }
Step 4. Introduce the reveal method and guard against Alice not revealing.

To guard against the attack in which Bob reads Alice’s plaintext bet before submitting his own bet (read more about this attack in our previous post), we introduce into the protocol the reveal method. The updated protocol is captured by Test 4, with the default postconditions, and analogous test ends with Bob calling cash_out.

Scenario 4 (Alice must reveal her hand): Alice: rps.player1_bet(Bob, n, handA) Bob: rps.player2_bet(Alice, m, handB) Alice: rps.reveal(key) Alice: rps.cash_out()

We assume that the reveal method is written by the programmer. This is a reasonable assumption because it was the programmer who invented the protocol with reveal. The method (not shown) simply decrypts handA and stores the plaintext handA in Alice’s GameState.

The more interesting case is how we synthesize the contract that guards against Alice not revealing. See our previous post for a description of why Alice is incentivized not to reveal. This attack, by Alice, is captured by Test 5 and an analogous test where Bob calls cash_out.

Test 5 (Alice doesn’t reveal): Alice: rps.player1_bet(Bob, n, handA) Bob: rps.player2_bet(Alice, m, handB) Timer: rps.reveal_expire(Alice, Bob) Alice: rps.cash_out() Postconditions: (Alice lost by not revealing) Alice.balance == pre(Alice.balance) - n Bob.balance == pre(Bob.balance) + n

The synthesizer invents a clever strategy that keeps most of the code unchanged. The strategy adds to the game state of each player a reference to the player who will receive the winnings. Let’s call this field the recipient. The default recipient is the player her/himself; this is set up in the synthesized player1_bet and player2_bet methods. Now the trick: if Alice fails to reveal, the reveal_expire method, called by a timer service, is synthesized to set Alice’s recipient to Bob.

The code, shown below, does not change the logic of the deposit code fragment, or any other aspect of the code. Interestingly, the code works correctly even when the beats predicate receives Alice’s encrypted hand, which happens when Alice did not reveal. This is because Alice’s recipient has been changed to Bob, so regardless of the outcome of beats, all deposits will be made to Bob.

def reveal_expire(sender: Timer, self: RPS, player1: Account, player2: Account) { // Initially, the players’ r.recipient fields point to // themselves, meaning that the payout should go // to them. If Alice fails to reveal, she forfeits // her winnings (if any), to Bob. player1.r.recipient = player2 } def cash_out(sender: Account, self: RPS) { let recipient = sender.r.recipient let other = sender.r.other let others_other = other.r.other let other_recipient = other.r.recipient if sender == others_other if beats(sender.r.hand, other.r.hand) recipient.deposit(other.r.bet) else other_recipient.deposit(other.r.bet) if beats(other.r.hand, sender.r.hand) other_recipient.deposit(sender.r.bet) else recipient.deposit(sender.r.bet) }
Addition scenarios.

We have shown how adding test scenarios forces the synthesizer to produce an implementation that is correct on that scenario as well as on all previous scenarios. We have chosen the four steps because they illustrate (1) the power of scenarios to describe desired behavior and (2) the ability of the synthesizer to come up with simple, clever code.

The scenarios presented so far are by themselves insufficient to produce a correct implementation. For example, two additional scenarios to include would be a possible attack by Alice and non-responsiveness by Bob:

  • Alice reads Bob’s plaintext hand and, if she lost, she calls cash_out instead of calling reveal. On this scenario, the code synthesized in Step 4 would behave as if the game tied, returning the bets to players. This is because the predicate beats called on an encrypted hand will nearly always return false. Once this scenario is given to the synthesizer, the generated code enforces that either reveal or reveal_expire have been called before a cash_out call.
  • Bob may decide not to respond to Alice’s challenge. In this case, Alice wants her bet to be returned to her. The solution is for the contract to wait for the player2_bet call only a bounded time. The synthesized code changes the game state when Bob times out, allowing Alice to call cash_out.

Adding these scenarios will harden the code against these attacks, and adding a few more scenarios will produce a contract that is correct under all possible attacks.

We want to conclude with two questions that may be on the reader’s mind.

Where do scenarios come from?

Writing scenarios manually seem challenging because it requires thinking of possible attacks --- doesn’t this make synthesis quite laborious? To make synthesis a productivity win, we ask the programmer for only the attack-free scenarios (Scenarios 1 and 2). The trick is to derive the remaining scenarios automatically: we synthesize a candidate contract code from the current set of scenarios; if this candidate is incorrect, the verifier produces a test that shows the attack; we add this test to the current scenarios and repeat the synthesis-verification process until the candidate is approved by the verifier. Our previous post described how our verifier obtains these tests.

Your code is synthesized from tests--what correctness properties can you guarantee?

The correctness of synthesized code is guaranteed by the verifier, not by the synthesizer. If the verifier gives us scenarios for all possible attacks, then the synthesized code will be resilient under all these attacks. The power of the verifier in turn boils down to the quality of the attack model, which describes, among other things, what values the attacker can obtain by examining the chain or by frontrunning. We plan to cover the attack model in a future post.

Start using SM

Email request@synthetic-minds.com to get started.