SQL Injection Strategies
上QQ阅读APP看书,第一时间看更新

An overview of SQL – a relational query language

One of the most common ways to keep data memorized in computer systems is by relying on databases. Databases can be seen as large software containers that can hold lots of information in a structured and accessible way, in order to optimize how to store data and access their operations.

Depending on the approach and model used, the way in which this is achieved can vary in terms of implementation. One of the most common ways is to use the relational model, which is based on relational algebra, for which data is a collected as a series of records that describe the relationships that exist among objects. SQL is a query language that is based on such concepts, and it is widely adopted in many database systems. This section will deal with these topics in depth by first explaining database management systems, relational databases, and SQL.

Database management systems and relational databases

The implementation of a database, as we mentioned earlier, relies on an underlying system, or a database nanagement system (DBMS). A DBMS is basically a piece of software responsible for storing, accessing, manipulating and, in general, managing data through a specific definition of the collected and managed information.

For the purpose of this book, we will now pide database systems into two large families to better understand the differences between them. We can distinguish between database models in terms of relational databases and non-relational databases due to the relevance of the relational model in data management.

Relational databases

Relational databases have been widely considered as a standard due to their many advantages. Data is collected in tables, in which rows represent objects, memorized as records, and columns represent their attributes. The name is derived from the way in which data can be correlated and connected, that is, through relations based on common attributes among tables. Thus, the concept of relational algebra becomes relevant as it describes the way in which, through a structured procedural language, data tables can be managed. SQL is the most popular representative of this model as it takes advantage of most of the concepts of relational algebra, thus providing a model that is easy to use by anyone without any coding experience, while maintaining its overall efficiency:

Figure 1.1 – A simple relational schema made up of three tables describing cars and owners, with the IDs (unique) put into a relationship

Non-relational databases

No-rel, which stands for non-relational, databases are a family of DBMS models considered as an alternative to the relational model and are usually much more prominent among database systems. Originally, the term NoSQL was used to define this family of systems, but it was considered misleading: some of the first attempts at building non-relational databases actually used some concepts of the relational model. No-rel databases include many models, some of which are as follows:

  • Network databases model the data as connected nodes in a network:

Figure 1.2 – A simple network schema to represent ownership relations between owners and cars

  • Graph-based databases highlight the connections among data using a graph-like navigable structure:

Figure 1.3 – The same ownership relation as in the relational example, this time represented in a graph-based model schema

  • Object-oriented databases model data as objects, in a similar fashion as in programming languages such as Java:

Figure 1.4 – The ownership relationship represented in an object-oriented model schema

  • Document-based databases describe data within documents containing key-value pairs, specify the way in which data is memorized and managed, and provide a flexible approach that does not rely on a defined schema. Document-based models can usually include embedded objects as collections within a single key, as shown in the following image:

Figure 1.5 – A document-based model schema that can represent the ownership relationship

Despite the name, SQL injection, in some form or another, might affect all existing database models. We will now focus on relational databases and SQL.

SQL – Structured Query Language

SQL stands for Structured Query Language, and it is the main tool used to access, navigate, and manage a relational database. SQL provides a well-structured language that is easy to understand, thanks to its natural language-like commands and the clarity of the operations it executes corresponding to specific language strings, which will be described in the following sections.

SQL has many different implementations, depending on the system it resides on, with some slight differences, some of which will be explained in the next chapter in more detail, as they are directly relevant for the SQL injection attack. Let's take a look at the most popular SQL implementations.

MySQL

MySQL is an open source version of SQL that's used in many web application frameworks and famous websites. It is considered one of the main representatives of SQL technologies, as well as an overall well-performing implementation.

MySQL is considered probably the best implementation in terms of open source SQL engines, and it is often taken as a reference for SQL syntax in general.

Here, we will list some peculiarities to remember about MySQL.

There is more than one way to insert comments in terms of character sequences:

  • #
  • /*comment*/
  • -- (This requires a blank space followed by any character in order to be interpreted as a comment. In practical tests, we use the combination -- -.)
  • ;%00 (%00 is the null character, here shown in URL encoding. This is an unofficial method for inserting comments as it's not shown in the official documentation.)
  • ` (Reverse single quote, another unofficial method.)

In general, MySQL systems have two default databases that are always present in the schema:

  • mysql (only available to privileged users)
  • information_schema (only available from MySQL version 5 onward)

MySQL supports functions and variables such as VERSION() and @@VERSION to retrieve MySQL server versioning.

SQLite

SQLite provides a different approach by presenting an implementation that is directly embedded in the application code, without the client-server architecture being used. While it is recommended for lightweight applications, such as mobile apps, it may have some shortcomings due to some intrinsic simplifications.

The main peculiarity about SQLite is that it stores information within a SQLite database file, without requiring the client-server infrastructure. Thus, being standalone, it's best not to use it for sensitive operations, such as authentication, or, in general, storing sensitive information: anyone with access to the filesystem can easily get a full view of the database.

Oracle Database

Oracle Database, often referred to as just Oracle, is one of the main proprietary SQL systems. Despite being born as a SQL relational DBMS, it started supporting different models over time. Thus, it is considered a multi-model DBMS.

In terms of proprietary database systems, Oracle is the most popular model among enterprises thanks to its wide compatibility with many technologies, programming languages, and database models.

Like MySQL, Oracle Database also has some peculiarities you need to remember in terms of database structure and syntax.

Unlike other database systems, Oracle Database supports only one way to insert comments in terms of character sequences: --.

Oracle Database systems also have two default databases:

  • SYSTEM
  • SYSAUX
Microsoft SQL Server

Microsoft SQL Server is one of the most common solutions in the enterprise world. It is a SQL DBMS optimized for running on the Windows Server OS, which is one of the most widely adopted server operating systems.

Microsoft SQL Server (MSSQL) also has its own share of peculiarities.

MSSQL supports three ways to insert comments in terms of character sequences:

  • /*comment*/
  • --
  • %00

MSSQL systems have many default databases that are always present in the schema:

  • pubs
  • model
  • msdb
  • tempdb
  • northwind
  • information_schema (from MSSQL Server 2000 onward)

MSSQL allows the use of the @@VERSION command for retrieving the database system version.

SQL is, in general, a high-performance language capable of querying structured data. Queries follow a specific readable syntax and allow users and database managers to perform various operations, ranging from creating and deleting tables to extracting data according to specific conditions. The following section focuses on showing the basic SQL syntax and capabilities, setting language implementation differences aside for the moment, while also examining the logic behind the commands mentioned.