Table of Links Abstract and 1. Introduction Abstract and 1. Introduction Background 2.1 Ethereum Primer 2.2 Whitelisted Address Verification 2.3 Taint Analysis on Smart Contracts and 2.4 Threat Model Motivating Example and Challenges 3.1 Motivating Example 3.2 Challenges 3.3 Limitations of Existing Tools Design of AVVERIFIER and 4.1 Overview 4.2 Notations 4.3 Component#1: Code Grapher 4.4 Component#2: EVM Simulator 4.5 Component#3: Vulnerability Detector Evaluation 5.1 Experimental Setup & Research Questions 5.2 RQ1: Effectiveness & Efficiency 5.3 RQ2: Characteristics of Real-world Vulnerable Contracts 5.4 RQ3: Real-time Detection Discussion 6.1 Threats to Validity and 6.2 Limitations 6.3 Ethical Consideration Related Work Conclusion, Availability, and References Background 2.1 Ethereum Primer 2.2 Whitelisted Address Verification 2.3 Taint Analysis on Smart Contracts and 2.4 Threat Model Background 2.1 Ethereum Primer 2.1 Ethereum Primer 2.2 Whitelisted Address Verification 2.2 Whitelisted Address Verification 2.3 Taint Analysis on Smart Contracts and 2.4 Threat Model 2.3 Taint Analysis on Smart Contracts and 2.4 Threat Model Motivating Example and Challenges 3.1 Motivating Example 3.2 Challenges 3.3 Limitations of Existing Tools Motivating Example and Challenges 3.1 Motivating Example 3.1 Motivating Example 3.2 Challenges 3.2 Challenges 3.3 Limitations of Existing Tools 3.3 Limitations of Existing Tools Design of AVVERIFIER and 4.1 Overview 4.2 Notations 4.3 Component#1: Code Grapher 4.4 Component#2: EVM Simulator 4.5 Component#3: Vulnerability Detector Design of AVVERIFIER and 4.1 Overview Design of AVVERIFIER and 4.1 Overview 4.2 Notations 4.2 Notations 4.3 Component#1: Code Grapher 4.3 Component#1: Code Grapher 4.4 Component#2: EVM Simulator 4.4 Component#2: EVM Simulator 4.5 Component#3: Vulnerability Detector 4.5 Component#3: Vulnerability Detector Evaluation 5.1 Experimental Setup & Research Questions 5.2 RQ1: Effectiveness & Efficiency 5.3 RQ2: Characteristics of Real-world Vulnerable Contracts 5.4 RQ3: Real-time Detection Evaluation 5.1 Experimental Setup & Research Questions 5.1 Experimental Setup & Research Questions 5.2 RQ1: Effectiveness & Efficiency 5.2 RQ1: Effectiveness & Efficiency 5.3 RQ2: Characteristics of Real-world Vulnerable Contracts 5.3 RQ2: Characteristics of Real-world Vulnerable Contracts 5.4 RQ3: Real-time Detection 5.4 RQ3: Real-time Detection Discussion 6.1 Threats to Validity and 6.2 Limitations 6.3 Ethical Consideration Discussion 6.1 Threats to Validity and 6.2 Limitations 6.1 Threats to Validity and 6.2 Limitations 6.3 Ethical Consideration 6.3 Ethical Consideration Related Work Related Work Related Work Conclusion, Availability, and References Conclusion, Availability, and References Conclusion, Availability, and References 3 Motivating Example and Challenges 3.1 Motivating Example Listing 3 shows a smart contract owned by Visor Finance that is vulnerable to the address verification vulnerability. It was attacked on Dec. 21st, 2021 [11], causing $8.2 million financial losses. As we can see, the deposit function takes three arguments, namely, the number of tokens to be deposited (visrDeposit), the payer (from), and the beneficiary (to). From L6 to L8, it performs sanity checks, i.e., a valid amount of deposit, and valid addresses for both payer and payee. After that, it translates the deposit into shares according to totalSupply() (L11 to L14), performs the corresponding token transfer from the from address to itself (L16 to L22), and mints some vvisr tokens to the to address (L24). The vulnerability is hidden in the if code block at L16. Specifically, it allows the from address as a contract, and examines if its owner function returns the address of the transaction initiator (L17). If the assertion passes, then it invokes the delegatedTransferERC20 function defined in from. Recalling the threat model mentioned in §2.4, attackers are able to deploy contracts and initiate transactions arbitrarily. More specific, if the from is actually provided by some malicious ones, they can control the behaviors of L17 and L18. To this end, the control flow will be successfully directed to L24, where vvisr finally issues tokens to to, which is also controlled by attackers, without receiving any tokens from from that is expected by developers of Visor Finance. Through this example, we can summarize three principles related to the address verification vulnerability: P1 The vulnerable function takes an address as a parameter, and performs insufficient authorization examination on that address. Through the address, attackers can pass self-deployed and unauthorized contracts. P1 The vulnerable function takes an address as a parameter, and performs insufficient authorization examination on that address. P2 The address in P1 is taken as the target of an external call. Through the external call, the control flow is transferred to attackers. Thus, they can totally control the behavior of the external call, including the return value. P2 The address in P1 P1 is taken as the target of an external call. P3 On-chain states that are control-flow dependent on the return value mentioned in P2 are updated. To this end, through an unauthorized control flow, adversaries can get profits by indirectly manipulating on-chain states, like initiating an external call or updating balance. P3 On-chain states that are control-flow dependent on the return value mentioned in P2 are updated. P2 3.2 Challenges In response to the address verification vulnerability, as outlined in the summarized principles and the motivating example in §3.1, we identify the following challenges. C1: Lack of semantics. It is challenging to precisely identify if the address mentioned in P1 is sufficiently verified due to the lack of semantics in bytecode. According to the statistics [59], more than 99% of Ethereum contracts have not released their source code. The bytecode format is quite unreadable and contains little semantic information. Moreover, there is no debug information to assist in recovering the semantics. Traditional bytecode-based analysis tools usually require some methods to overcome this challenge, like symbolic execution [70]. C1: Lack of semantics. C1: Lack of semantics. P1 C2: Inter-procedural analysis on control flow and data flow. Detecting this vulnerability requires accurately extracting the control flow and data flow dependencies inter-procedurally. Specifically, in P2, there is an external call to an address, which is passed via the argument. Between the external call and the function entry, it will be propagated several times due to the authorization verification. Thus, we have to precisely identify if the callee address is one of the arguments through parsing data flow. Moreover, in P3, the on-chain state update depends on the return value of the external call in P2 in terms of control flow, which requires us to identify control flow dependencies among variables. In addition, from Listing 2 we can conclude that some authorizations are verified in other functions, which requires inter-procedural analysis. C2: Inter-procedural analysis on control flow and data flow. C2: Inter-procedural analysis on control flow and data flow. P2 P3 P2 Authors: (1) Tianle Sun, Huazhong University of Science and Technology; (2) Ningyu He, Peking University; (3) Jiang Xiao, Huazhong University of Science and Technology; (4) Yinliang Yue, Zhongguancun Laboratory; (5) Xiapu Luo, The Hong Kong Polytechnic University; (6) Haoyu Wang, Huazhong University of Science and Technology. Authors: Authors (1) Tianle Sun, Huazhong University of Science and Technology; (2) Ningyu He, Peking University; (3) Jiang Xiao, Huazhong University of Science and Technology; (4) Yinliang Yue, Zhongguancun Laboratory; (5) Xiapu Luo, The Hong Kong Polytechnic University; (6) Haoyu Wang, Huazhong University of Science and Technology. This paper is available on arxiv under CC BY 4.0 DEED license. This paper is available on arxiv under CC BY 4.0 DEED license. available on arxiv available on arxiv