MySQL for Python
上QQ阅读APP看书,第一时间看更新

Forming a query in MySQL

In order to best understand how to submit a query through MySQL for Python, it is important to ensure you understand how to submit a query in MySQL itself. The similarities between the two outnumber the differences, but the first may seem confusing if you don't properly understand the second.

MySQL statements have a basic structure. In following a set structure, they are formed like natural language statements. Being a computer program, it understandably responds very poorly to informational statements and only moderately well to questions. Almost all MySQL statements have an imperatival tone, expressing your command. This is reflective of the client-server relationship. The computer is the servant who exists to do the bidding of yourself as the client or, if you prefer, master.

The syntactic structure of a simple MySQL statement is not that different from the language you use every day. Where English would have:

  • Give me everything from the staff table!

MySQL would need to hear:

  • SELECT * FROM staff;

Let's look at the MySQL statement, comparing it to the English in detail.

SELECT

MySQL does not support natural language searching like Give me. Rather, like other programming languages including Python, MySQL has a set of reserved key words. These are largely single synonyms for common, core actions. For data retrieval, the key word is SELECT. It could have been GIMME or any of a score of similar ways of saying the same thing, but MySQL is consonant with the Zen of Python:

There should be one—and preferably only one—obvious way to do it

Therefore, the MySQL developers settled on a single keyword—one that just happens to be compliant with the SQL standard.

* (asterisk)

Being read up on your regular expressions, I am sure you recognize this universal quantifier. While it is one of the most commonly used, MySQL supports several metacharacters that you can use to nuance your searches.

Note

MySQL supports different kinds of metacharacters in different contexts. The following is a full list of metacharacters. (Note that not all of them may be supported in a given situation.)

  • .: To match any single character
  • ?: To match zero or one character
  • *: To match zero or more characters
  • +: To match one or more characters
  • {n}: To match an expression n times
  • {m,n}: To match an expression a minimum of m and a maximum of n times
  • {n,}: To match an expression n or more times
  • ^: Indicates the bit-level indicator at the beginning of a line
  • $: Indicates the bit-level indicator at the end of a line
  • [[:<:]]: To match the beginning of words
  • [[:>:]]: To match the ending of words
  • [:class:]: To match a character class
  • [:alpha:]: For letters
  • [:space:]: For whitespace
  • [:punct:]: For punctuation
  • [:upper:]: For upper case letters
  • [abc]: To match one of the enclosed characters
  • [^xyz]: To match any character other than those enclosed
  • |: Separates alternatives within an expression

In the case of the SELECT command, the asterisk is the only metacharacter supported. In addition to the asterisk, however, SELECT also supports several arguments used to quantify results:

  • ALL: All matching rows (synonymous to using an asterisk (*)
  • DISTINCT: Sort the results set into unique values
  • DISTINCTROW: Where the entire record is unique

Each of these can be prefaced before the field to be quantified as illustrated here using the database structure from the last chapter:

SELECT DISTINCT id FROM menu;

This would return the values of the id column from the menu table and remove any duplicates from the results.

FROM

As with the English equivalent, MySQL needs some context in order to retrieve anything. In English, if one simply said Give me! without non-verbal cues for the intended context, the listener would rightly be confused. So here we tell MySQL from which table in the current database we want it to extract information.

Note that this is technically optional. Leaving it off, however, typically means that you are using MySQL's built-in datasets. For example, here is a statement using the built-in functions for the current date, user, and version (the \G is the same command as \g, but it tells MySQL to display the results vertically):

mysql> SELECT NOW(), USER(), VERSION()\G
*************************** 1. row ***************************
 NOW(): 2009-08-29 12:29:23
 USER(): skipper@localhost
VERSION(): 5.1.31-1ubuntu2
1 row in set (0.00 sec)

staff

This is merely the name of the table to be searched. In English, there are many locations from which one may desire something. That is why we would need to clarify that we want the items from the far table. MySQL, on the other hand, only understands things in terms of databases and tables and so understands as the name of a table whatever immediately follows the FROM keyword.

; (semicolon)

The semicolon is the default statement delimiter in MySQL. When creating a MySQL script or interacting with MySQL dynamically through its shell, leaving off a semicolon at the end of a statement will result in either a second prompt or, if you press the matter, an error being thrown. As we will see shortly, the syntax of MySQL for Python and Python itself mandates a different way of showing the end of the line. Therefore when passing MySQL queries in Python, we do not need to end any statements with the semicolon.

You may wonder why certain portions of the MySQL query are capitalized. It is a standard presentation format for MySQL statements to present the static or standard elements of a MySQL statement in capitals. Variable parts of the statement, however, are case sensitive and must be called with the same case in which they were created (otherwise, MySQL will throw an error). This matter of capitalization is not significant if you interact with MySQL directly, from a MySQL prompt. There, MySQL will understand your statements whether they be in all caps or lowercase. However, in your code, proper capitalization is critical to making your SQL statements readable to the next developer—both in Python and in MySQL scripts.

Where the semicolon is the statement delimiter in the MySQL shell, the backslash (\) is used to delimit lines within a statement. So, if you want to break up a statement but not have it executed when you press return, simply use a backslash at the end of each line of the statement. For example:

mysql> SELECT \
 -> * \
 -> FROM \
 -> menu;

Other helpful quantifiers

The previous discussion offers an overview of the SELECT command and its most common arguments. There are many other ways to nuance the data. In addition to FROM, you can also employ SELECT with the following optional arguments.

WHERE

WHERE is used to declare a condition under which MySQL is to narrow the results of the search. The basic syntax of the clause is:

[WHERE where_condition]

For example:

mysql> SELECT * FROM menu WHERE id='5';
+----+-------+-------+
| id | name | price |
+----+-------+-------+
| 5 | trout | 6.00 | 
+----+-------+-------+
1 row in set (0.00 sec)

GROUP BY

GROUP BY allows you to group results according to one of the following three parameters:

  • col_name: Is the name of one of the table's columns
  • expr: Is a regular expression
  • position: Is a position in the table

Once grouped, you can then tell MySQL to list the results in either ASCending or DESCending order through ASC and DESC, respectively. The former is the default. Additionally, MySQL provides for a summative line at the end of the results through the use of WITH ROLLUP.

The syntax of a GROUP BY clause is:

GROUP BY {col_name | expr | position} [ASC | DESC], [WITH ROLLUP]

To appreciate the effect of GROUP BY, you can retrieve all of the values from a table.

mysql> SELECT * FROM menu;
+----+---------------+-------+
| ID | NAME | PRICE |
+----+---------------+-------+
| 4 | catfish | 5.00 |
| 2 | bass | 6.75 |
| 6 | haddock | 6.50 |
| 3 | salmon | 9.50 |
| 5 | trout | 6.00 |
| 1 | tuna | 7.50 |
| 7 | yellowfin tuna | 12.00 |
+----+---------------+-------+
7 rows in set (0.00 sec)

Using GROUP BY on just one column can give us the same list in alphabetical order:

mysql> SELECT * FROM menu GROUP BY name;
+----+----------------+-------+
| id | name | price |
+----+----------------+-------+
| 2 | bass | 6.75 | 
| 4 | catfish | 5.00 | 
| 6 | haddock | 6.50 | 
| 3 | salmon | 9.50 | 
| 5 | trout | 6.00 | 
| 1 | tuna | 7.50 | 
| 7 | yellowfin tuna | 12.00 | 
+----+----------------+-------+
7 rows in set (0.00 sec)

If we had multiple entries for some of the fish (for example, tuna and yellowfin tuna), it could also be used to give a count by field value.

mysql> SELECT name, count(*) FROM menu GROUP BY name;
+----------------+----------+
| name | count(*) |
+----------------+----------+
| bass | 1 |
| catfish | 1 |
| haddock | 1 |
| salmon | 1 |
| trout | 1 |
| tuna | 2 |
| yellowfin tuna | 2 |
+----------------+----------+
7 rows in set (0.00 sec)

More on how to use the modifiers of GROUP BY can be found in Section 11.12 of the MySQL manual.

HAVING

As the WHERE clause has already been discussed, one might wonder rightly—what is the point of the HAVING clause? The WHERE clause is used for simple facts and does not support aggregate evaluations. The HAVING clause is used for aggregate functions. It can be used to replace WHERE, but to do so is generally viewed as poor coding because it violates the SQL standard.

The HAVING clause is used to quantify results according to aggregate functions. For this reason, it is usually used in conjunction with the GROUP BY clause.

The basic syntax of the HAVING clause is:

HAVING where_condition

Carrying on with the previous menu example, a basic example of this is:

mysql> SELECT * FROM menu GROUP BY name HAVING id>'3';
+----+----------------+-------+
| id | name | price |
+----+----------------+-------+
| 4 | catfish | 5.00 | 
| 6 | haddock | 6.50 | 
| 5 | trout | 6.00 | 
| 7 | yellowfin tuna | 12.00 | 
+----+----------------+-------+
4 rows in set (0.00 sec)

For an example closer to real life a video rental store that wants to know which customers rent the most videos might use a query like this one:

mysql> SELECT customer_id,count(*) AS cnt FROM rental GROUP BY customer_id HAVING cnt> 40;
+-------------+-----+
| customer_id | cnt |
+-------------+-----+
| 75 | 41 |
| 144 | 42 |
| 148 | 46 |
| 236 | 42 |
| 526 | 45 |
+-------------+-----+
5 rows in set (0.05 sec)

This shows the customer number followed by the number of total rentals in the record of rentals for each customer whose aggregate custom exceeds 40 videos.

ORDER BY

As the name implies, the ORDER BY clause is used to tell MySQL how to order the results of a query. The basic syntactical structure of this clause is as follows:

[ORDER BY {col_name | expr | position} [ASC | DESC], ...]

While the ORDER BY clause can be used in conjunction with the GROUP BY modifiers, this is typically not necessary. The following two examples illustrate why:

mysql> SELECT * FROM menu GROUP BY name ORDER BY id DESC;
+----+----------------+-------+
| id | name | price |
+----+----------------+-------+
| 7 | yellowfin tuna | 12.00 | 
| 6 | haddock | 6.50 | 
| 5 | trout | 6.00 | 
| 4 | catfish | 5.00 | 
| 3 | salmon | 9.50 | 
| 2 | bass | 6.75 | 
| 1 | tuna | 7.50 | 
+----+----------------+-------+
7 rows in set (0.00 sec)
mysql> SELECT * FROM menu ORDER BY id DESC;
+----+----------------+-------+
| id | name | price |
+----+----------------+-------+
| 7 | yellowfin tuna | 12.00 | 
| 6 | haddock | 6.50 | 
| 5 | trout | 6.00 | 
| 4 | catfish | 5.00 | 
| 3 | salmon | 9.50 | 
| 2 | bass | 6.75 | 
| 1 | tuna | 7.50 | 
+----+----------------+-------+
7 rows in set (0.00 sec)

Because the ORDER BY is applied after the GROUP BY, it largely abrogates the need for the grouping.

LIMIT

The LIMIT clause is used to restrict the number of rows that are returned in the result set. It takes two positive integers as arguments. The first number indicates the point at which to start counting and counts from zero for that process. The second number indicates how many times to increment the first number by one in order to determine the desired limit.

The syntax of the LIMIT clause is as follows:

LIMIT {[offset,] row_count | row_count OFFSET offset}

The following four examples show how LIMIT may be used to reduce the returned results neatly. Used in an iterative fashion, incrementing the parameters of a LIMIT clause allows you to step through results.

In this first example, LIMIT is applied to an alphabetic listing of fish names. The table in question is the same one we used previously for GROUP BY. Note that the id numbers are out of sequence.

mysql> SELECT * FROM menu GROUP BY name LIMIT 3,4;
+----+----------------+-------+
| id | name | price |
+----+----------------+-------+
| 3 | salmon | 9.50 | 
| 5 | trout | 6.00 | 
| 1 | tuna | 7.50 | 
| 7 | yellowfin tuna | 12.00 | 
+----+----------------+-------+
4 rows in set (0.00 sec)

In order to get the id numbers sequenced correctly, we employ an ORDER BY clause to prep the data before applying the terms of the LIMIT clause to it.

mysql> SELECT * FROM menu ORDER BY id LIMIT 3,4;
+----+----------------+-------+
| id | name | price |
+----+----------------+-------+
| 4 | catfish | 5.00 | 
| 5 | trout | 6.00 | 
| 6 | haddock | 6.50 | 
| 7 | yellowfin tuna | 12.00 | 
+----+----------------+-------+
4 rows in set (0.00 sec)

These final two examples illustrate how to apply LIMIT to searches that could easily return scores, if not hundreds or thousands, of hits.

mysql> SELECT * FROM menu ORDER BY id LIMIT 2,3;
+----+---------+-------+
| id | name | price |
+----+---------+-------+
| 3 | salmon | 9.50 | 
| 4 | catfish | 5.00 | 
| 5 | trout | 6.00 | 
+----+---------+-------+
3 rows in set (0.00 sec)
mysql> SELECT * FROM menu LIMIT 2,4;
+----+---------+-------+
| id | name | price |
+----+---------+-------+
| 3 | salmon | 9.50 | 
| 4 | catfish | 5.00 | 
| 5 | trout | 6.00 | 
| 6 | haddock | 6.50 | 
+----+---------+-------+
4 rows in set (0.00 sec)

Note

LIMIT and HAVING may seem very similar as they both work to narrow the aggregate. The difference between them lies in the timing of their application by MySQL. HAVING is applied as a parameter of the search before MySQL actions the query. The LIMIT clause, on the other hand, is applied after the search results have been returned.

If you are programming for a web application and your database and web server are located on a single machine, you need to conserve your server resources. Therefore, you almost certainly want to use HAVING instead of LIMIT. If you are trying to reduce your search time, again, use HAVING. However, if your desired hits will comprise a sizable portion of the results otherwise, or your database server, application server, and web server are each discrete systems from each other, then you might consider using LIMIT. In the main, however, LIMIT allows MySQL to use more resources than HAVING because the former is applied after the query is already processed.

INTO OUTFILE

INTO OUTFILE allows for the rapid output of tabular results to a text file on the local host. Its basic syntax is as follows:

INTO OUTFILE 'file_name'

For example, one could use:

mysql> SELECT * FROM menu ORDER BY id LIMIT 3,4 INTO OUTFILE '/tmp/results.txt';
Query OK, 4 rows affected (0.00 sec)

This would output the results of the query to a file results.txt in the /tmp directory of the server.

More information can be found in the MySQL manual, Section 12.2.8 (URL: http://dev.mysql.com/doc/refman/5.1/en/select.html). It is understood that MySQL for Python allows a program to process the data and output the results using Python's own I/O calls.