Loading...

How to use program synthesis to secure smart contracts - Part 1/3

Code gets complex quickly, especially in the domain of smart contract programming in Solidity. In addition to the usual sources of programming complexity, Solidity has unique features that make programming in it prone to errors. Raising the risk and complexity further, smart contracts are deployed in an environment that by design does not permit fixes. When this code manages a large investment of time or money, we need assurances on the code before deployment. At Synthetic Minds, we aim for smart contracts to be as error-free as possible.


We build program synthesizers and program verifiers. The former can generate code while the latter can prove code correct. We have built a front-end for them that supports Ethereum's Solidity smart contract language. This allows us to verify the properties/business logic of smart contracts and check their every potential interaction with other contracts (that we synthesize). This way, we can take as input the code for a smart contract, explore its interaction space with other contracts, and then output the code of other contracts that could violate specified properties.


In this post, we illustrate our approach using code examples. Let's consider a simple contract:

contract Market {
   ...
        modifier onlyPaid {
                if (msg.value <= 0) revert();
                _;
        }

        modifier onlyOwner {
                if (msg.sender != owner) revert();
                _;
        }

        function Market() public {
                deployedAt = now;
                initOwner(msg.sender);
        }

        function initOwner(address _o) public {
                owner = _o;
        }

        function () {
        }

        function submit(address data) onlyPaid payable public {
                address submitBy = msg.sender;
                repository[submitBy] = data;
                submitter[data] = submitBy;
                refunds[submitBy] = msg.value;
                fundsLeft += msg.value;
                lastSubmitter = submitBy;
        }
   ...
}


The purpose of this contract is to be a "data market". Submitters can provide arbitrary data to be stored by the market. To make the market self-sustaining they initially have to pay the contract a fee to hold the data, but which will be refunded as soon as someone later asks to query the data. The data is stored in the map repository and submitter stores the provider for a data fragment with refunds holding the security deposits.


This is a toy example which keeps its data in publicly visible storage. By reading the transaction history on the blockchain, people could theoretically get the data without querying the contract with a transaction, but it suffices as a simplified illustration.


We can now use SM's verifier to verify assertions about properties that should hold across interactions between two contracts: the Market example contract and a hypothetical - machine generated - Other contract. Assertions would involve properties like "do not leak funds to arbitrary other contracts", which we can state as "Market.balance after interactions >= start of interactions". We'll call this property KEEP in the rest of this post.


This Other contract consists of a combination of function calls and parameters. We will later see how we might be able to come up with relevant code for Other.


Now if we run it with the above Market contract, our verifier comes to the conclusion that under all scenarios Market maintains KEEP.


Refunding contract users

Now let us consider if we add a simple function to Market to let Customers get refunds for the value they submitted to Market:


function get(uint _amount) public {
        require(refunds[msg.sender] >= _amount);
        msg.sender.call.value(_amount)("");
        refunds[msg.sender] -= _amount;
        fundsLeft -= _amount;
}


This seemingly innocuous addition results in the SM verifier reporting serious issues. It tells us Market is not safe under KEEP anymore. That information is useful, but not by itself actionable. In addition to the binary outcome, it also produces the following code fragment for Other (using program synthesis - see Sidebar):


contract Other {
        Market x;
        bool b;

        function Other(address _o) public {
                x = Market(_o);
        }

        function () public
        {
                if (b) {
                        b = false;
                        x.get(21);
                }
        }

        function go() public
        {
                x.submit.value(21)(this);
                x.get(21);
        }
}


If this code fragment looks too clean, that is because it is. The raw synthesized code is in the Appendix. It is trivial to add an automated cleanup phase in future versions. For the sake of exposition in this post, we will only referenced fragments in readable form such as the above, with the understanding that the raw synthesized code look like the Appendix.


And additionally produces the following trace:

Market.new
Other.new
Other.go
Market.submit
Market.get
Other.fallback
Market.get
Other.fallback
assert failure


Note that the assert failure reported here is for KEEP, and that is over the combination of Market and Other. Market does not contain an explicit reference to the KEEP assert. If can however have additional correctness asserts (aka business logic) specific to its functionality, and the analysis would consider their validity as well.


For verifying balance properties, and context and path dependent business logic assertions, the analysis needs to precisely model storage, effects of function calls and transactions, and blockchain state updates. SM's verifier is built to handle that.


This particular example demonstrates a bug type commonly known as re-entrancy, and it gets automatically rediscovered (without explicit pattern encodings) due to: 1) synthesis exploring contracts that can fit Other, 2) expressive verification that tracks storage and msgs precisely, and 3) a global assertion KEEP over the combination of Market and Other.

How does SM generate code?

To generate code, SM uses techniques from a domain called "program synthesis". Other was synthesized using that framework.


Our backend does not have any Solidity specific encodings/patterns. We compile both Market and Other into an imperative language, specifically translating blockchain specific features (now, balance, etc.), object-related constructs, and message passing into a simplified framework. The backend just understands normal control flow and attempts to find a path that invalidates the assertion.


Recently, program synthesis has attracted the attention of the AI community. We do not use deep learning. Instead, our program synthesizer comes from techniques developed in the programming languages community: formal methods based program verification, and automated theorem proving. These have been around longer and are more mature, and yield whitebox deterministic code instead of blackbox probabilistic models. A detailed description of our approach to program synthesis will be the subject of a future blog post.

Time dependent functionality

Next consider if there was some functionality that we want gated to only activate at some time in the future, and have the function do nothing until then:

function timegate() public {
        // we only continue execution at some distant time in the future
        require(now > deployedAt + 30 days);
        // Later, allow sender to get their money back
        msg.sender.transfer(refunds[msg.sender]);
        // Someone forgot to update the refund balance...
}


Here the synthesizer outputs the simple code fragment:

contract Other {
        Market x;

        function Other(address _o) public {
                x = Market(_o);
        }

        function go() public
        {
                x.submit.value(21)(this);
                x.timegate();
                x.timegate();
        }
}
with the trace:
Market.new
Other.new
Other.go
Market.submit
Market.timegate
Market.timegate
assert failure


Note again that we did not need to specify anything other than our KEEP assertion.

Ownership control

A common feature implemented in most contracts is the ability for owners to shut down the contract:

function closeDown() onlyOwner public {
        selfdestruct(owner);
}


Again, with no additional annotations (just the default KEEP assertion) SM synthesizes:

contract Other {
        Market x;

        function Other(address _o) public {
                x = Market(other);
        }

        function go() public
        {
                x.initOwner(this);
                x.closeDown();
        }
}
with the trace:
Market.new
Other.new
Other.go
Market.initOwner
Market.closeDown
assert failure


Note that there was an obvious issue with the implementation of initOwner. A good implementation would have had a onlyOwner modifier gating access, but this version does not. Running SM over such a corrected implementation, SM no longer finds an Other that violates KEEP.

Relation to known patterns

The usual approach to auditing contracts is to check for known anti-patterns, and online compilations exist of known bad patterns (e.g., Zeppelin, Trail-of-Bits, DASP, Eth-Security). Static analysis tools can check for such patterns.


SM does not check for specific patterns. SM's approach is to build an expressive verification platform, to specifically concentrate on business logic correctness, and to use synthesis as the UX to formal verification. SM should be viewed as complementary to static analysis to provide additional guarantees on your codebase. Below we reference the code patterns from DASP.co and relate them to the inferences that get drawn from first-principles in this post.


In the next post, we will continue discussing smart contract safety and also discuss additional features of SM: evaluating liveness properties that help identify denial-of-service, custom business logic checks, and low level property checks.


In the post after that, we will dive deeper in the approach SM uses for program synthesis and discuss further applications.


Table mapping against DASP.co:

get           | Re-entrancy                 | DASP.co # 1 | This post
closeMarket   | Suicide                     | DASP.co # 2 | This post
lowMath       | Arithmetic                  | DASP.co # 3 | Next post
refundPartial | Low level send              | DASP.co # 4 | Next post
makePresident | Denial of Service           | DASP.co # 5 | Next post
---- na ----  | Bad Randomness              | DASP.co # 6 | Information flow security, not checked
---- na ----  | Front running               | DASP.co # 7 | Mining layer, not checked
timeGated     | Time manipulation           | DASP.co # 8 | This post
---- na ----  | Short addresses             | DASP.co # 9 | Protocol layer or off-chain, not checked
query         | Business logic correctness  |   -- na --  | Next post

Start using SM

We are on-boarding customers onto the platform this summer.
  • Initial on-boarding: A one-time process where we make sure SM can analyze your business logic, and operates as needed over your codebase.
  • Monthly: SM's analysis will autorun on the latest version of your codebase. This will synthesize a whole spectrum of contracts, some of which might be marked as special-note. E.g., ones that cause problems and which you would want to incorporate into your test-suite for regression testing of future versions. We will make the entire suite of generated contracts available every month, as well as a detailed report summarizing the outcomes.
Email request@synthetic-minds.com to get started, or talk to us directly.






Appendix

Raw source of synthesized contract:
contract User {
    address other;
    address me = this;
    bool var_1;
    function User(address _o) public {
        other = _o;
    }
    function() public {
        if (var_1) {
            var_1 = 1 == 0;
            if (!other.call.value(0 + 1).gas(9136 + 1) (bytes4(keccak256("get(uint256)")), 3)) {
                revert();
            }
        } else {
            var_1 = 0 == 1;
        }
    }

    function run() public {
        uint start_bal;

        start_bal = this.balance;
        var_1 = 0 == 0;
        if (!other.call.value(3 + 1).gas(2 + 1) (bytes4(keccak256("submit(address)")), other)) {
            revert();
        }
        if (!other.call.value(0 + 1).gas(2720 + 1) (bytes4(keccak256("get(uint256)")), 4)) {
            revert();
        }
        assert(this.balance <= start_bal);
    }

}