Query Builder

Built on top of Database Access Objects, query builder allows you to construct a SQL query in a programmatic and DBMS-agnostic way. Compared to writing raw SQL statements, using query builder will help you write more readable SQL-related code and generate more secure SQL statements.

Using query builder usually involves two steps:

  1. Build a yii\db\Query object to represent different parts (e.g. SELECT, FROM) of a SELECT SQL statement.
  2. Execute a query method (e.g. all()) of yii\db\Query to retrieve data from the database.

The following code shows a typical way of using query builder:

$rows = (new \yii\db\Query())
    ->select(['id', 'email'])
    ->from('user')
    ->where(['last_name' => 'Smith'])
    ->limit(10)
    ->all();

The above code generates and executes the following SQL query, where the :last_name parameter is bound with the string 'Smith'.

SELECT `id`, `email` 
FROM `user`
WHERE `last_name` = :last_name
LIMIT 10

Info: You usually mainly work with yii\db\Query instead of yii\db\QueryBuilder. The latter is invoked by the former implicitly when you call one of the query methods. yii\db\QueryBuilder is the class responsible for generating DBMS-dependent SQL statements (e.g. quoting table/column names differently) from DBMS-independent yii\db\Query objects.

Building Queries

To build a yii\db\Query object, you call different query building methods to specify different parts of a SQL query. The names of these methods resemble the SQL keywords used in the corresponding parts of the SQL statement. For example, to specify the FROM part of a SQL query, you would call the from() method. All the query building methods return the query object itself, which allows you to chain multiple calls together.

In the following, we will describe the usage of each query building method.

select()

The select() method specifies the SELECT fragment of a SQL statement. You can specify columns to be selected in either an array or a string, like the following. The column names being selected will be automatically quoted when the SQL statement is being generated from a query object.

$query->select(['id', 'email']);

// equivalent to:

$query->select('id, email');

The column names being selected may include table prefixes and/or column aliases, like you do when writing raw SQL queries. For example,

$query->select(['user.id AS user_id', 'email']);

// equivalent to:

$query->select('user.id AS user_id, email');

If you are using the array format to specify columns, you can also use the array keys to specify the column aliases. For example, the above code can be rewritten as follows,

$query->select(['user_id' => 'user.id', 'email']);

If you do not call the select() method when building a query, * will be selected, which means selecting all columns.

Besides column names, you can also select DB expressions. You must use the array format when selecting a DB expression that contains commas to avoid incorrect automatic name quoting. For example,

$query->select(["CONCAT(first_name, ' ', last_name) AS full_name", 'email']); 

As with all places where raw SQL is involved, you may use the DBMS agnostic quoting syntax for table and column names when writing DB expressions in select.

Starting from version 2.0.1, you may also select sub-queries. You should specify each sub-query in terms of a yii\db\Query object. For example,

$subQuery = (new Query())->select('COUNT(*)')->from('user');

// SELECT `id`, (SELECT COUNT(*) FROM `user`) AS `count` FROM `post`
$query = (new Query())->select(['id', 'count' => $subQuery])->from('post');

To select distinct rows, you may call distinct(), like the following:

// SELECT DISTINCT `user_id` ...
$query->select('user_id')->distinct();

You can call addSelect() to select additional columns. For example,

$query->select(['id', 'username'])
    ->addSelect(['email']);

from()

The from() method specifies the FROM fragment of a SQL statement. For example,

// SELECT * FROM `user`
$query->from('user');

You can specify the table(s) being selected from in either a string or an array. The table names may contain schema prefixes and/or table aliases, like you do when writing raw SQL statements. For example,

$query->from(['public.user u', 'public.post p']);

// equivalent to:

$query->from('public.user u, public.post p');

If you are using the array format, you can also use the array keys to specify the table aliases, like the following:

$query->from(['u' => 'public.user', 'p' => 'public.post']);

Besides table names, you can also select from sub-queries by specifying them in terms of yii\db\Query objects. For example,

$subQuery = (new Query())->select('id')->from('user')->where('status=1');

// SELECT * FROM (SELECT `id` FROM `user` WHERE status=1) u 
$query->from(['u' => $subQuery]);

Prefixes

Also a default tablePrefix can be applied. Implementation instructions are in the "Quoting Tables" section of the "Database Access Objects" guide.

where()

The where() method specifies the WHERE fragment of a SQL query. You can use one of the three formats to specify a WHERE condition:

  • string format, e.g., 'status=1'
  • hash format, e.g. ['status' => 1, 'type' => 2]
  • operator format, e.g. ['like', 'name', 'test']

String Format

String format is best used to specify very simple conditions or if you need to use built-in functions of the DBMS. It works as if you are writing a raw SQL. For example,

$query->where('status=1');

// or use parameter binding to bind dynamic parameter values
$query->where('status=:status', [':status' => $status]);

// raw SQL using MySQL YEAR() function on a date field
$query->where('YEAR(somedate) = 2015');

Do NOT embed variables directly in the condition like the following, especially if the variable values come from end user inputs, because this will make your application subject to SQL injection attacks.

// Dangerous! Do NOT do this unless you are very certain $status must be an integer.
$query->where("status=$status");

When using parameter binding, you may call params() or addParams() to specify parameters separately.

$query->where('status=:status')
    ->addParams([':status' => $status]);

As with all places where raw SQL is involved, you may use the DBMS agnostic quoting syntax for table and column names when writing conditions in string format.

Hash Format

Hash format is best used to specify multiple AND-concatenated sub-conditions each being a simple equality assertion. It is written as an array whose keys are column names and values the corresponding values that the columns should be. For example,

// ...WHERE (`status` = 10) AND (`type` IS NULL) AND (`id` IN (4, 8, 15))
$query->where([
    'status' => 10,
    'type' => null,
    'id' => [4, 8, 15],
]);

As you can see, the query builder is intelligent enough to properly handle values that are nulls or arrays.

You can also use sub-queries with hash format like the following:

$userQuery = (new Query())->select('id')->from('user');

// ...WHERE `id` IN (SELECT `id` FROM `user`)
$query->where(['id' => $userQuery]);

Using the Hash Format, Yii internally uses parameter binding so in contrast to the string format, here you do not have to add parameters manually.

Operator Format

Operator format allows you to specify arbitrary conditions in a programmatic way. It takes the following format:

[operator, operand1, operand2, ...]

where the operands can each be specified in string format, hash format or operator format recursively, while the operator can be one of the following:

  • and: the operands should be concatenated together using AND. For example, ['and', 'id=1', 'id=2'] will generate id=1 AND id=2. If an operand is an array, it will be converted into a string using the rules described here. For example, ['and', 'type=1', ['or', 'id=1', 'id=2']] will generate type=1 AND (id=1 OR id=2). The method will NOT do any quoting or escaping.

  • or: similar to the and operator except that the operands are concatenated using OR.

  • between: operand 1 should be the column name, and operand 2 and 3 should be the starting and ending values of the range that the column is in. For example, ['between', 'id', 1, 10] will generate id BETWEEN 1 AND 10.

  • not between: similar to between except the BETWEEN is replaced with NOT BETWEEN in the generated condition.

  • in: operand 1 should be a column or DB expression. Operand 2 can be either an array or a Query object. It will generate an IN condition. If Operand 2 is an array, it will represent the range of the values that the column or DB expression should be; If Operand 2 is a Query object, a sub-query will be generated and used as the range of the column or DB expression. For example, ['in', 'id', [1, 2, 3]] will generate id IN (1, 2, 3). The method will properly quote the column name and escape values in the range. The in operator also supports composite columns. In this case, operand 1 should be an array of the columns, while operand 2 should be an array of arrays or a Query object representing the range of the columns.

  • not in: similar to the in operator except that IN is replaced with NOT IN in the generated condition.

  • like: operand 1 should be a column or DB expression, and operand 2 be a string or an array representing the values that the column or DB expression should be like. For example, ['like', 'name', 'tester'] will generate name LIKE '%tester%'. When the value range is given as an array, multiple LIKE predicates will be generated and concatenated using AND. For example, ['like', 'name', ['test', 'sample']] will generate name LIKE '%test%' AND name LIKE '%sample%'. You may also provide an optional third operand to specify how to escape special characters in the values. The operand should be an array of mappings from the special characters to their escaped counterparts. If this operand is not provided, a default escape mapping will be used. You may use false or an empty array to indicate the values are already escaped and no escape should be applied. Note that when using an escape mapping (or the third operand is not provided), the values will be automatically enclosed within a pair of percentage characters.

    Note: When using PostgreSQL you may also use ilike instead of like for case-insensitive matching.

  • or like: similar to the like operator except that OR is used to concatenate the LIKE predicates when operand 2 is an array.

  • not like: similar to the like operator except that LIKE is replaced with NOT LIKE in the generated condition.

  • or not like: similar to the not like operator except that OR is used to concatenate the NOT LIKE predicates.

  • exists: requires one operand which must be an instance of yii\db\Query representing the sub-query. It will build an EXISTS (sub-query) expression.

  • not exists: similar to the exists operator and builds a NOT EXISTS (sub-query) expression.

  • >, <=, or any other valid DB operator that takes two operands: the first operand must be a column name while the second operand a value. For example, ['>', 'age', 10] will generate age>10.

Using the Operator Format, Yii internally uses parameter binding so in contrast to the string format, here you do not have to add parameters manually.

Appending Conditions

You can use andWhere() or orWhere() to append additional conditions to an existing one. You can call them multiple times to append multiple conditions separately. For example,

$status = 10;
$search = 'yii';

$query->where(['status' => $status]);

if (!empty($search)) {
    $query->andWhere(['like', 'title', $search]);
}

If $search is not empty, the following WHERE condition will be generated:

WHERE (`status` = 10) AND (`title` LIKE '%yii%')

Filter Conditions

When building WHERE conditions based on input from end users, you usually want to ignore those input values, that are empty. For example, in a search form that allows you to search by username and email, you would like to ignore the username/email condition if the user does not enter anything in the username/email input field. You can achieve this goal by using the filterWhere() method:

// $username and $email are from user inputs
$query->filterWhere([
    'username' => $username,
    'email' => $email,
]);

The only difference between filterWhere() and where() is that the former will ignore empty values provided in the condition in hash format. So if $email is empty while $username is not, the above code will result in the SQL condition WHERE username=:username.

Info: A value is considered empty if it is null, an empty array, an empty string or a string consisting of whitespaces only.

Like andWhere() and orWhere(), you can use andFilterWhere() and orFilterWhere() to append additional filter conditions to the existing one.

Additionally, there is yii\db\Query::andFilterCompare() that can intelligently determine operator based on what's in the value:

$query->andFilterCompare('name', 'John Doe');
$query->andFilterCompare('rating', '>9');
$query->andFilterCompare('value', '<=100');

You can also specify operator explicitly:

$query->andFilterCompare('name', 'Doe', 'like');

Since Yii 2.0.11 there are similar methods for HAVING condition:

orderBy()

The orderBy() method specifies the ORDER BY fragment of a SQL query. For example,

// ... ORDER BY `id` ASC, `name` DESC
$query->orderBy([
    'id' => SORT_ASC,
    'name' => SORT_DESC,
]);

In the above code, the array keys are column names while the array values are the corresponding order by directions. The PHP constant SORT_ASC specifies ascending sort and SORT_DESC descending sort.

If ORDER BY only involves simple column names, you can specify it using a string, just like you do when writing raw SQL statements. For example,

$query->orderBy('id ASC, name DESC');

Note: You should use the array format if ORDER BY involves some DB expression.

You can call addOrderBy() to add additional columns to the ORDER BY fragment. For example,

$query->orderBy('id ASC')
    ->addOrderBy('name DESC');

groupBy()

The groupBy() method specifies the GROUP BY fragment of a SQL query. For example,

// ... GROUP BY `id`, `status`
$query->groupBy(['id', 'status']);

If GROUP BY only involves simple column names, you can specify it using a string, just like you do when writing raw SQL statements. For example,

$query->groupBy('id, status');

Note: You should use the array format if GROUP BY involves some DB expression.

You can call addGroupBy() to add additional columns to the GROUP BY fragment. For example,

$query->groupBy(['id', 'status'])
    ->addGroupBy('age');

having()

The having() method specifies the HAVING fragment of a SQL query. It takes a condition which can be specified in the same way as that for where(). For example,

// ... HAVING `status` = 1
$query->having(['status' => 1]);

Please refer to the documentation for where() for more details about how to specify a condition.

You can call andHaving() or orHaving() to append additional conditions to the HAVING fragment. For example,

// ... HAVING (`status` = 1) AND (`age` > 30)
$query->having(['status' => 1])
    ->andHaving(['>', 'age', 30]);

limit() and offset()

The limit() and offset() methods specify the LIMIT and OFFSET fragments of a SQL query. For example,

// ... LIMIT 10 OFFSET 20
$query->limit(10)->offset(20);

If you specify an invalid limit or offset (e.g. a negative value), it will be ignored.

Info: For DBMS that do not support LIMIT and OFFSET (e.g. MSSQL), query builder will generate a SQL statement that emulates the LIMIT/OFFSET behavior.

join()

The join() method specifies the JOIN fragment of a SQL query. For example,

// ... LEFT JOIN `post` ON `post`.`user_id` = `user`.`id`
$query->join('LEFT JOIN', 'post', 'post.user_id = user.id');

The join() method takes four parameters:

  • $type: join type, e.g., 'INNER JOIN', 'LEFT JOIN'.
  • $table: the name of the table to be joined.
  • $on: optional, the join condition, i.e., the ON fragment. Please refer to where() for details about specifying a condition. Note, that the array syntax does not work for specifying a column based condition, e.g. ['user.id' => 'comment.userId'] will result in a condition where the user id must be equal to the string 'comment.userId'. You should use the string syntax instead and specify the condition as 'user.id = comment.userId'.
  • $params: optional, the parameters to be bound to the join condition.

You can use the following shortcut methods to specify INNER JOIN, LEFT JOIN and RIGHT JOIN, respectively.

For example,

$query->leftJoin('post', 'post.user_id = user.id');

To join with multiple tables, call the above join methods multiple times, once for each table.

Besides joining with tables, you can also join with sub-queries. To do so, specify the sub-queries to be joined as yii\db\Query objects. For example,

$subQuery = (new \yii\db\Query())->from('post');
$query->leftJoin(['u' => $subQuery], 'u.id = author_id');

In this case, you should put the sub-query in an array and use the array key to specify the alias.

union()

The union() method specifies the UNION fragment of a SQL query. For example,

$query1 = (new \yii\db\Query())
    ->select("id, category_id AS type, name")
    ->from('post')
    ->limit(10);

$query2 = (new \yii\db\Query())
    ->select('id, type, name')
    ->from('user')
    ->limit(10);

$query1->union($query2);

You can call union() multiple times to append more UNION fragments.

Query Methods

yii\db\Query provides a whole set of methods for different query purposes:

  • all(): returns an array of rows with each row being an associative array of name-value pairs.
  • one(): returns the first row of the result.
  • column(): returns the first column of the result.
  • scalar(): returns a scalar value located at the first row and first column of the result.
  • exists(): returns a value indicating whether the query contains any result.
  • count(): returns the result of a COUNT query.
  • Other aggregation query methods, including sum($q), average($q), max($q), min($q). The $q parameter is mandatory for these methods and can be either a column name or a DB expression.

For example,

// SELECT `id`, `email` FROM `user`
$rows = (new \yii\db\Query())
    ->select(['id', 'email'])
    ->from('user')
    ->all();
    
// SELECT * FROM `user` WHERE `username` LIKE `%test%`
$row = (new \yii\db\Query())
    ->from('user')
    ->where(['like', 'username', 'test'])
    ->one();

Note: The one() method only returns the first row of the query result. It does NOT add LIMIT 1 to the generated SQL statement. This is fine and preferred if you know the query will return only one or a few rows of data (e.g. if you are querying with some primary keys). However, if the query may potentially result in many rows of data, you should call limit(1) explicitly to improve the performance, e.g., (new \yii\db\Query())->from('user')->limit(1)->one().

All these query methods take an optional $db parameter representing the DB connection that should be used to perform a DB query. If you omit this parameter, the db application component will be used as the DB connection. Below is another example using the count() query method:

// executes SQL: SELECT COUNT(*) FROM `user` WHERE `last_name`=:last_name
$count = (new \yii\db\Query())
    ->from('user')
    ->where(['last_name' => 'Smith'])
    ->count();

When you call a query method of yii\db\Query, it actually does the following work internally:

Sometimes, you may want to examine or use the SQL statement built from a yii\db\Query object. You can achieve this goal with the following code:

$command = (new \yii\db\Query())
    ->select(['id', 'email'])
    ->from('user')
    ->where(['last_name' => 'Smith'])
    ->limit(10)
    ->createCommand();
    
// show the SQL statement
echo $command->sql;
// show the parameters to be bound
print_r($command->params);

// returns all rows of the query result
$rows = $command->queryAll();

Indexing Query Results

When you call all(), it will return an array of rows which are indexed by consecutive integers. Sometimes you may want to index them differently, such as indexing by a particular column or expression values. You can achieve this goal by calling indexBy() before all(). For example,

// returns [100 => ['id' => 100, 'username' => '...', ...], 101 => [...], 103 => [...], ...]
$query = (new \yii\db\Query())
    ->from('user')
    ->limit(10)
    ->indexBy('id')
    ->all();

To index by expression values, pass an anonymous function to the indexBy() method:

$query = (new \yii\db\Query())
    ->from('user')
    ->indexBy(function ($row) {
        return $row['id'] . $row['username'];
    })->all();

The anonymous function takes a parameter $row which contains the current row data and should return a scalar value which will be used as the index value for the current row.

Note: In contrast to query methods like groupBy() or orderBy() which are converted to SQL and are part of the query, this method works after the data has been fetched from the database. That means that only those column names can be used that have been part of SELECT in your query. Also if you selected a column with table prefix, e.g. customer.id, the result set will only contain id so you have to call ->indexBy('id') without table prefix.

Batch Query

When working with large amounts of data, methods such as yii\db\Query::all() are not suitable because they require loading the whole query result into the client's memory. To solve this issue Yii provides batch query support. The server holds the query result, and the client uses a cursor to iterate over the result set one batch at a time.

Warning: There are known limitations and workarounds for the MySQL implementation of batch queries. See below.

Batch query can be used like the following:

use yii\db\Query;

$query = (new Query())
    ->from('user')
    ->orderBy('id');

foreach ($query->batch() as $users) {
    // $users is an array of 100 or fewer rows from the user table
}

// or to iterate the row one by one
foreach ($query->each() as $user) {
    // data is being fetched from the server in batches of 100,
    // but $user represents one row of data from the user table
}

The method yii\db\Query::batch() and yii\db\Query::each() return an yii\db\BatchQueryResult object which implements the Iterator interface and thus can be used in the foreach construct. During the first iteration, a SQL query is made to the database. Data is then fetched in batches in the remaining iterations. By default, the batch size is 100, meaning 100 rows of data are being fetched in each batch. You can change the batch size by passing the first parameter to the batch() or each() method.

Compared to the yii\db\Query::all(), the batch query only loads 100 rows of data at a time into the memory.

If you specify the query result to be indexed by some column via yii\db\Query::indexBy(), the batch query will still keep the proper index.

For example:

$query = (new \yii\db\Query())
    ->from('user')
    ->indexBy('username');

foreach ($query->batch() as $users) {
    // $users is indexed by the "username" column
}

foreach ($query->each() as $username => $user) {
    // ...
}

Limitations of batch query in MySQL

MySQL implementation of batch queries relies on the PDO driver library. By default, MySQL queries are buffered. This defeats the purpose of using the cursor to get the data, because it doesn't prevent the whole result set from being loaded into the client's memory by the driver.

Note: When libmysqlclient is used (typical of PHP5), PHP's memory limit won't count the memory used for result sets. It may seem that batch queries work correctly, but in reality the whole dataset is loaded into client's memory, and has the potential of using it up.

To disable buffering and reduce client memory requirements, PDO connection property PDO::MYSQL_ATTR_USE_BUFFERED_QUERY must be set to false. However, until the whole dataset has been retrieved, no other query can be made through the same connection. This may prevent ActiveRecord from making a query to get the table schema when it needs to. If this is not a problem (the table schema is cached already), it is possible to switch the original connection into unbuffered mode, and then roll back when the batch query is done.

Yii::$app->db->pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

// Do batch query

Yii::$app->db->pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);

Note: In the case of MyISAM, for the duration of the batch query, the table may become locked, delaying or denying write access for other connections. When using unbuffered queries, try to keep the cursor open for as little time as possible.

If the schema is not cached, or it is necessary to run other queries while the batch query is being processed, you can create a separate unbuffered connection to the database:

$unbufferedDb = new \yii\db\Connection([
    'dsn' => Yii::$app->db->dsn,
    'username' => Yii::$app->db->username,
    'password' => Yii::$app->db->password,
    'charset' => Yii::$app->db->charset,
]);
$unbufferedDb->open();
$unbufferedDb->pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

If you want to ensure that the $unbufferedDb has exactly the same PDO attributes like the original buffered $db but the PDO::MYSQL_ATTR_USE_BUFFERED_QUERY is false, consider a deep copy of $db, set it to false manually.

Then, queries are created normally. The new connection is used to run batch queries and retrieve results either in batches or one by one:

// getting data in batches of 1000
foreach ($query->batch(1000, $unbufferedDb) as $users) {
    // ...
}


// data is fetched from server in batches of 1000, but is iterated one by one 
foreach ($query->each(1000, $unbufferedDb) as $user) {
    // ...
}

When the connection is no longer necessary and the result set has been retrieved, it can be closed:

$unbufferedDb->close();

Note: unbuffered query uses less memory on the PHP-side, but can increase the load on the MySQL server. It is recommended to design your own code with your production practice for extra massive data, for example, divide the range for integer keys, loop them with Unbuffered Queries.