ication specification. Any suspicious SQL queries that violate the corresponding invariants are identified as potential attacks. We implement a prototype detection system, SENTINEL (SEcuriNg daTabase from logIc flaws iN wEb appLication), and evaluate it using a set of real-world web applications. The experimental results demonstrate the effectiveness of our approach, and our implementation incurs acceptable performance overhead.
Categories and Subject Descriptors
[Database Management]: Database Administration - Security, integrity, and protection
[…] the AT&T website allowed an attacker to harvest Apple iPad subscribers' email addresses by enumerating ICCID numbers, which affected over 100,000 Apple customers [1]. In the Web scenario, the front-end web application acts as the single user that interacts with the database. Thus, the database fully trusts the web application and accepts and executes all the queries submitted by the application. As such, vulnerabilities within web applications may introduce security concerns for the information stored in the database.
One class of attacks exploits the application's input validation mechanisms to tamper with the intended structure of SQL queries issued by the application, which is well known as SQL injection. Another class of attacks, referred to as state violation attacks [9], exploits logic flaws within the application to trick it into sending SQL queries at incorrect application states. For example, an attacker may retrieve other users' account information without providing the administrator's credentials to the application. While a large body of literature focuses on fortifying the application's input validation mechanisms, only a few works have attempted to address logic flaws within web applications. Logic flaws are specific to the functionality of each web application and are thus more difficult to handle. The key to this problem is to derive the application's intended logic (i.e., its specification) in a general and automated way.
One approach to inferring the application specification is to leverage program source code. Swaddler [9] establishes statistical models of the application state for each program block using session variables, while Waler [13] characterizes the application logic by associating value-based invariants on function parameters and session variables with each program function. This approach is limited in that it relies on program source code to extract the specification. The inferred specification is highly dependent on how the application is structured and implemented (e.g., the definition of a program function or block). Thus, implementation flaws may result in an inaccurate specification.
Another approach infers the application specification by observing and characterizing the application's external behavior. BLOCK [18] observes the web requests/responses between the web application and its users and extracts the associated invariants. While BLOCK, as a black-box approach, is source-code free, its capability is limited since it only observes web requests/responses without taking into account the large amount of information persisted in the database, resulting in an incomplete specification. The persistent information in the database may affect the application's behavior in two ways. First, the application can use persistent objects in the database to maintain its persistent state across web sessions, while using session variables to manage the state during a session. Second, the persistent objects may embed complex data constraints for web applications. Moreover, because BLOCK examines only web requests/responses, it is incapable of handling certain state violation attacks that target the database.
In this paper, we present a black-box approach for automated detection of state violation attacks with a focus on securing the back-end database.
To be more specific, we aim to identify and block malicious SQL queries, which are issued in a way that violates the application specification. To derive the application specification in a black-box manner, we have to address the following two issues:
(1) What external behavior to observe in order to collect sufficient information for specification inference. Since we focus on securing the database, we observe the interaction between the web application and the database. For the application to utilize persistent objects stored in the database, they have to be returned within SQL responses first. Thus, we collect all the observed SQL queries and responses, as well as the corresponding session variables.
(2) How to infer the application logic from the collected information in a systematic way, so that the application behavior can be characterized adequately. We model the web application as an extended finite state machine (EFSM). EFSMs have been employed for modeling the behavior of complex software [19], since they capture not only the state transitions but also the data constraints associated with the transitions, and they fit the web application scenario well.
To derive the EFSM, we first construct SQL signatures from the observed SQL queries; these represent the output symbols emitted by the EFSM. Then, we extract a set of invariants for each SQL signature from both session variables and SQL responses, which characterize the application state and the associated data constraints when a SQL query is issued. In particular, we leverage a well-known technique (i.e., the Daikon engine [11]) to derive value-based invariants, including the invariants over variables that indicate the application state and the data constraints that can be expressed as a mathematical relationship between variables. In addition, we extract the dependencies between SQL signatures to infer other data constraints, which are implicitly specified within previously issued SQL queries.
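As a rough illustration of the signature-and-invariant idea described above, the following Python sketch builds a SQL signature by replacing literals with placeholders, learns a few value-based invariants per signature from observed (query, session) pairs, and flags queries that violate them. It is a minimal sketch under stated assumptions, not the SENTINEL implementation: the function names (`sql_signature`, `learn_invariants`, `violates`) are hypothetical, the invariant learner is a deliberately simplified stand-in for Daikon that only handles "session variable is constant" and "query parameter equals session variable", and SQL responses and dependencies between signatures are left out.

```python
import re

# Single-quoted string literals (with '' escapes) or standalone numeric literals.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def sql_signature(query):
    """Return (signature, params): the query skeleton with literals replaced by '?'."""
    params = []

    def _replace(match):
        params.append(match.group(0).strip("'"))
        return "?"

    skeleton = _LITERAL.sub(_replace, query)
    # Normalize whitespace and case so equivalent queries share one signature.
    return " ".join(skeleton.split()).upper(), params


def learn_invariants(traces):
    """Learn invariants per signature from (query, session_dict) observations.

    Simplified stand-in for an invariant engine: an invariant is kept only
    if it held on every observation of that signature.
    """
    invariants = {}
    for query, session in traces:
        signature, params = sql_signature(query)
        observed = set()
        for name, value in session.items():
            observed.add(("const", name, str(value)))
            for index, param in enumerate(params):
                if param == str(value):
                    observed.add(("param_eq", index, name))
        if signature in invariants:
            invariants[signature] &= observed
        else:
            invariants[signature] = observed
    return invariants


def violates(query, session, invariants):
    """Return True if the query breaks an invariant or has an unseen signature."""
    signature, params = sql_signature(query)
    if signature not in invariants:
        return True
    for kind, a, b in invariants[signature]:
        if kind == "const":
            # a = session variable name, b = its expected value
            if a not in session or str(session[a]) != b:
                return True
        elif kind == "param_eq":
            # a = parameter index, b = session variable name
            if a >= len(params) or b not in session or params[a] != str(session[b]):
                return True
    return False
```

Trained on traces in which `SELECT * FROM users` is only ever issued with a hypothetical session variable `role` set to `admin`, this sketch would flag the same query issued from a non-administrator session, which mirrors the state violation example in the introduction. With few training traces the learned invariants are over-specific, which is one reason a real system needs a more capable inference engine and more observations.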
The set of invariants, indexed by SQL signatures, manifests the application specification and is used for evaluating incoming SQL quer