Commit 1571c722 by Qiang Xue

Fixes #2409: Added support for fetching data from database in batches

parent f198f655
......@@ -113,6 +113,25 @@ $customers = Customer::find()->indexBy('id')->all();
// $customers array is indexed by customer IDs
```
Batch query is also supported when working with Active Record. For example,
```php
// fetch 10 customers at a time
foreach (Customer::find()->batch() as $customers) {
// $customers is an array of 10 or fewer Customer objects
}
// fetch customers one by one
foreach (Customer::find()->each() as $customer) {
// $customer is a Customer object
}
// batch query with eager loading
foreach (Customer::find()->with('orders')->batch() as $customers) {
}
```
As explained in [Query Builder](query-builder.md), batch queries are very useful when you are fetching
a large amount of data from the database, because only one batch of rows is held in memory at a time.
Accessing Column Data
---------------------
......
......@@ -13,30 +13,50 @@ $rows = (new \yii\db\Query)
->select('id, name')
->from('tbl_user')
->limit(10)
->createCommand()
->queryAll();
->all();
// which is equivalent to the following code:
$query = new \yii\db\Query;
$query->select('id, name')
$query = (new \yii\db\Query)
->select('id, name')
->from('tbl_user')
->limit(10);
// Create a command.
// Create a command. You can get the actual SQL using $command->sql
$command = $query->createCommand();
// You can get the actual SQL using $command->sql
// Execute the command:
$rows = $command->queryAll();
```
Query Methods
-------------
As you can see, [[yii\db\Query]] is the main player that you need to deal with. Behind the scenes,
`Query` is actually only responsible for representing various query information. The actual query
building logic is done by [[yii\db\QueryBuilder]] when you call the `createCommand()` method,
and the query execution is done by [[yii\db\Command]].
For convenience, [[yii\db\Query]] provides a set of commonly used query methods that will build
the query, execute it, and return the result. For example,
- [[yii\db\Query::all()|all()]]: builds the query, executes it and returns all results as an array.
- [[yii\db\Query::one()|one()]]: returns the first row of the result.
- [[yii\db\Query::column()|column()]]: returns the first column of the result.
- [[yii\db\Query::scalar()|scalar()]]: returns the first column in the first row of the result.
- [[yii\db\Query::exists()|exists()]]: returns a value indicating whether the query results in anything.
- [[yii\db\Query::count()|count()]]: returns the result of a `COUNT` query. Other similar methods
include `sum()`, `average()`, `max()`, and `min()`, which run the corresponding aggregate queries.
Building Query
--------------
In the following, we will explain how to build various clauses in a SQL statement. For simplicity,
we use `$query` to represent a [[yii\db\Query]] object.
`SELECT`
--------
### `SELECT`
In order to form a basic `SELECT` query, you need to specify what columns to select and from what table:
......@@ -68,8 +88,7 @@ To select distinct rows, you may call `distinct()`, like the following:
$query->select('user_id')->distinct()->from('tbl_post');
```
`FROM`
------
### `FROM`
To specify which table(s) to select data from, call `from()`:
......@@ -102,44 +121,7 @@ $query->select('*')->from(['u' => $subQuery]);
```
`JOIN`
-----
The `JOIN` clauses are generated in the Query Builder by using the applicable join method:
- `innerJoin()`
- `leftJoin()`
- `rightJoin()`
This left join selects data from two related tables in one query:
```php
$query->select(['tbl_user.name AS author', 'tbl_post.title as title'])
->from('tbl_user')
->leftJoin('tbl_post', 'tbl_post.user_id = tbl_user.id');
```
In the code, the `leftJoin()` method's first parameter
specifies the table to join to. The second parameter defines the join condition.
If your database application supports other join types, you can use those via the generic `join` method:
```php
$query->join('FULL OUTER JOIN', 'tbl_post', 'tbl_post.user_id = tbl_user.id');
```
The first argument is the join type to perform. The second is the table to join to, and the third is the condition.
Like `FROM`, you may also join with sub-queries. To do so, specify the sub-query as an array
which must contain one element. The array value must be a `Query` object representing the sub-query,
while the array key is the alias for the sub-query. For example,
```php
$query->leftJoin(['u' => $subQuery], 'u.id=author_id');
```
`WHERE`
-------
### `WHERE`
Usually data is selected based upon certain criteria. Query Builder has some useful methods to specify these, the most powerful of which is `where`. It can be used in multiple ways.
......@@ -250,8 +232,7 @@ In case `$search` isn't empty the following SQL will be generated:
WHERE (`status` = 10) AND (`title` LIKE '%yii%')
```
`ORDER BY`
-----
### `ORDER BY`
To order results, use `orderBy` and `addOrderBy`:
......@@ -266,8 +247,7 @@ Here we are ordering by `id` ascending and then by `name` descending.
```
Group and Having
----------------
### `GROUP BY` and `HAVING`
To add `GROUP BY` to the generated SQL, you can use the following:
......@@ -288,8 +268,7 @@ for these are similar to the ones for `where` methods group:
$query->having(['status' => $status]);
```
Limit and offset
----------------
### `LIMIT` and `OFFSET`
To limit the result to 10 rows, use `limit`:
......@@ -303,8 +282,43 @@ To skip 100 fist rows use:
$query->offset(100);
```
Union
-----
### `JOIN`
The `JOIN` clauses are generated in the Query Builder by using the applicable join method:
- `innerJoin()`
- `leftJoin()`
- `rightJoin()`
This left join selects data from two related tables in one query:
```php
$query->select(['tbl_user.name AS author', 'tbl_post.title as title'])
->from('tbl_user')
->leftJoin('tbl_post', 'tbl_post.user_id = tbl_user.id');
```
In the code, the `leftJoin()` method's first parameter
specifies the table to join to. The second parameter defines the join condition.
If your database application supports other join types, you can use those via the generic `join` method:
```php
$query->join('FULL OUTER JOIN', 'tbl_post', 'tbl_post.user_id = tbl_user.id');
```
The first argument is the join type to perform. The second is the table to join to, and the third is the condition.
Like `FROM`, you may also join with sub-queries. To do so, specify the sub-query as an array
which must contain one element. The array value must be a `Query` object representing the sub-query,
while the array key is the alias for the sub-query. For example,
```php
$query->leftJoin(['u' => $subQuery], 'u.id=author_id');
```
### `UNION`
`UNION` in SQL adds results of one query to results of another query. Columns returned by both queries should match.
In Yii, to build it you can first form two query objects and then use the `union` method:
......@@ -319,3 +333,55 @@ $anotherQuery->select('id, 'user' as type, name')->from('tbl_user')->limit(10);
$query->union($anotherQuery);
```
Batch Query
-----------
When working with a large amount of data, methods such as [[yii\db\Query::all()]] are not suitable
because they require loading all data into memory. To keep the memory requirement low, Yii
provides the so-called batch query support. A batch query makes use of a data cursor and fetches
data in batches.
Batch query can be used like the following:
```php
use yii\db\Query;
$query = (new Query)
->from('tbl_user')
->orderBy('id');
foreach ($query->batch(10) as $users) {
// $users is an array of 10 or fewer rows from the user table
}
```
The method [[yii\db\Query::batch()]] returns a [[yii\db\BatchQueryResult]] object which implements
the `Iterator` interface and thus can be used in the `foreach` construct. On each iteration,
it returns an array of query results. The size of the array is determined by the so-called batch
size, which is the first parameter (defaults to 10) to the method.
Compared to the `$query->all()` call, the above code only loads 10 rows of data at a time into the memory.
If you process the data and then discard it right away, the batch query can help keep the memory usage under a limit.
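The grouping idea described above can be sketched without a database. This is only an illustration, not Yii's implementation: `inBatches()` and `fakeCursor()` are hypothetical helpers, and a generator stands in for the data cursor so that only one batch of rows exists in memory at a time.

```php
// Hypothetical helper (not part of Yii): groups rows from any iterable
// source into arrays of at most $size rows.
function inBatches(iterable $rows, int $size = 10): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }
    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;   // hand one full batch to the caller
            $batch = [];    // drop it before reading further rows
        }
    }
    if ($batch !== []) {
        yield $batch;       // final batch, possibly smaller than $size
    }
}

// Hypothetical stand-in for a database cursor: produces one row at a time.
function fakeCursor(int $total): Generator
{
    for ($id = 1; $id <= $total; $id++) {
        yield ['id' => $id, 'name' => "user$id"];
    }
}

$sizes = [];
foreach (inBatches(fakeCursor(25), 10) as $users) {
    // $users is an array of 10 or fewer rows
    $sizes[] = count($users);
}
// $sizes is [10, 10, 5]
```

As in the real batch query, the last batch may hold fewer rows than the batch size, so the loop body should not assume a fixed count.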
Note that in the special case when you specify the batch size as 1, each iteration of the batch query
returns a single row of data, rather than an array containing one row. In this case, you may also use
the shortcut method [[yii\db\Query::each()]]. For example,
```php
use yii\db\Query;
$query = (new Query)
->from('tbl_user')
->orderBy('id');
foreach ($query->each() as $user) {
// $user represents a row from the user table
}
// the above code is equivalent to the following:
foreach ($query->batch(1) as $user) {
// $user represents a row from the user table
}
```
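Because a batch size of 1 changes the shape of the value each iteration yields, code that accepts a variable batch size has to handle both shapes. A minimal sketch of the unwrapping step, assuming a hypothetical helper `unwrapBatch()` that is not part of Yii:

```php
// Hypothetical helper (not part of Yii): mimics how a batch of size 1
// is reduced to its single row, while larger batches stay as arrays.
function unwrapBatch(array $rows, int $batchSize)
{
    // reset() returns the first element of a non-empty array.
    return $batchSize === 1 ? reset($rows) : $rows;
}

// Batch size 1: the result is the row itself.
$row = unwrapBatch([['id' => 1, 'name' => 'user1']], 1);

// Batch size 2: the result is still an array of rows.
$rows = unwrapBatch([['id' => 1], ['id' => 2]], 2);
```

Here `$row['name']` can be read directly, while `$rows` must be looped over.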
......@@ -109,6 +109,7 @@ Yii Framework 2 Change Log
- Enh #2240: Improved `yii\web\AssetManager::publish()`, `yii\web\AssetManager::getPublishedPath()` and `yii\web\AssetManager::getPublishedUrl()` to support aliases (vova07)
- Enh #2325: Adding support for the `X-HTTP-Method-Override` header in `yii\web\Request::getMethod()` (pawzar)
- Enh #2364: Take into account current error reporting level in error handler (gureedo)
- Enh #2409: Added support for fetching data from database in batches (nineinchnick, qiangxue)
- Enh #2417: Added possibility to set `dataType` for `$.ajax` call in yii.activeForm.js (Borales)
- Enh: Added support for using arrays as option values for console commands (qiangxue)
- Enh: Added `favicon.ico` and `robots.txt` to default application templates (samdark)
......
......@@ -64,25 +64,28 @@ class ActiveQuery extends Query implements ActiveQueryInterface
*/
public function all($db = null)
{
$command = $this->createCommand($db);
$rows = $command->queryAll();
if (!empty($rows)) {
$models = $this->createModels($rows);
if (!empty($this->join) && $this->indexBy === null) {
$models = $this->removeDuplicatedModels($models);
}
if (!empty($this->with)) {
$this->findWith($this->with, $models);
}
if (!$this->asArray) {
foreach ($models as $model) {
$model->afterFind();
}
}
return $models;
} else {
return parent::all($db);
}
public function prepareResult($rows)
{
if (empty($rows)) {
return [];
}
$models = $this->createModels($rows);
if (!empty($this->join) && $this->indexBy === null) {
$models = $this->removeDuplicatedModels($models);
}
if (!empty($this->with)) {
$this->findWith($this->with, $models);
}
if (!$this->asArray) {
foreach ($models as $model) {
$model->afterFind();
}
}
return $models;
}
/**
......
<?php
/**
* @link http://www.yiiframework.com/
* @copyright Copyright (c) 2008 Yii Software LLC
* @license http://www.yiiframework.com/license/
*/
namespace yii\db;
use yii\base\Object;
/**
* BatchQueryResult represents the query result from which you can retrieve the data in batches.
*
* BatchQueryResult is mainly used with [[Query::batch()]].
* @author Qiang Xue <qiang.xue@gmail.com>
* @since 2.0
*/
class BatchQueryResult extends Object implements \Iterator
{
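/**
* @var Connection the database connection passed to [[Query::createCommand()]] when the query is executed.
*/
public $db;
/**
* @var Query the query object that creates the command and prepares each batch of rows.
*/
public $query;
/**
* @var integer the number of rows to return in each batch. Defaults to 10.
*/
public $batchSize = 10;
/**
* @var DataReader the data reader for the running query.
* It is null until iteration starts and after [[reset()]] is called.
*/
public $dataReader;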
private $_data;
private $_index = -1;
public function __destruct()
{
$this->reset();
}
public function reset()
{
if ($this->dataReader !== null) {
$this->dataReader->close();
}
$this->dataReader = null;
$this->_data = null;
$this->_index = -1;
}
/**
* Resets the iterator to the initial state.
* This method is required by the interface Iterator.
*/
public function rewind()
{
$this->reset();
$this->next();
}
/**
* Returns the index of the current batch.
* This method is required by the interface Iterator.
* @return integer the zero-based index of the current batch.
*/
public function key()
{
return $this->_index;
}
/**
* Returns the current batch of rows.
* This method is required by the interface Iterator.
* @return mixed the current batch of rows, or a single row if the batch size is 1.
*/
public function current()
{
return $this->_data;
}
/**
* Moves the internal pointer to the next batch, executing the query on the first call.
* This method is required by the interface Iterator.
*/
public function next()
{
if ($this->dataReader === null) {
$this->dataReader = $this->query->createCommand($this->db)->query();
$this->_index = 0;
} else {
$this->_index++;
}
$rows = [];
$count = 0;
while ($count++ < $this->batchSize && ($row = $this->dataReader->read())) {
$rows[] = $row;
}
if (empty($rows)) {
$this->_data = null;
} else {
$this->_data = $this->query->prepareResult($rows);
if ($this->batchSize == 1) {
$this->_data = reset($this->_data);
}
}
}
/**
* Returns whether there is a batch of data at the current position.
* This method is required by the interface Iterator.
* @return boolean whether there is a batch of data at the current position.
*/
public function valid()
{
return $this->_data !== null;
}
}
......@@ -123,6 +123,21 @@ class Query extends Component implements QueryInterface
return $db->createCommand($sql, $params);
}
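/**
* Starts a batch query.
* The returned object implements `Iterator`, so it can be used in a `foreach` loop.
* @param integer $size the number of rows to fetch in each batch. Defaults to 10.
* @param Connection $db the database connection, passed as-is to [[BatchQueryResult]].
* @return BatchQueryResult the batch query result.
*/
public function batch($size = 10, $db = null)
{
return Yii::createObject([
'class' => BatchQueryResult::className(),
'query' => $this,
'batchSize' => $size,
'db' => $db,
]);
}
/**
* Starts a batch query with a batch size of 1, so that each iteration yields a single row.
* @param Connection $db the database connection, passed as-is to [[batch()]].
* @return BatchQueryResult the batch query result.
*/
public function each($db = null)
{
return $this->batch(1, $db);
}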
/**
* Executes the query and returns all results as an array.
* @param Connection $db the database connection used to generate the SQL statement.
......@@ -132,6 +147,11 @@ class Query extends Component implements QueryInterface
public function all($db = null)
{
$rows = $this->createCommand($db)->queryAll();
return $this->prepareResult($rows);
}
public function prepareResult($rows)
{
if ($this->indexBy === null) {
return $rows;
}
......
<?php
/**
* @link http://www.yiiframework.com/
* @copyright Copyright (c) 2008 Yii Software LLC
* @license http://www.yiiframework.com/license/
*/
namespace yiiunit\framework\db;
use Yii;
use yiiunit\data\ar\ActiveRecord;
use yii\db\Query;
use yii\db\BatchQueryResult;
use yiiunit\data\ar\Customer;
/**
* @author Qiang Xue <qiang.xue@gmail.com>
* @since 2.0
*/
class BatchQueryResultTest extends DatabaseTestCase
{
public function setUp()
{
parent::setUp();
ActiveRecord::$db = $this->getConnection();
}
public function testQuery()
{
$db = $this->getConnection();
// initialize property test
$query = new Query();
$query->from('tbl_customer')->orderBy('id');
$result = $query->batch(2, $db);
$this->assertTrue($result instanceof BatchQueryResult);
$this->assertEquals(2, $result->batchSize);
$this->assertNull($result->dataReader);
$this->assertTrue($result->query === $query);
// normal query
$query = new Query();
$query->from('tbl_customer')->orderBy('id');
$allRows = [];
$batch = $query->batch(2, $db);
foreach ($batch as $rows) {
$allRows = array_merge($allRows, $rows);
}
$this->assertEquals(3, count($allRows));
$this->assertEquals('user1', $allRows[0]['name']);
$this->assertEquals('user2', $allRows[1]['name']);
$this->assertEquals('user3', $allRows[2]['name']);
// rewind
$allRows = [];
foreach ($batch as $rows) {
$allRows = array_merge($allRows, $rows);
}
$this->assertEquals(3, count($allRows));
// reset
$batch->reset();
$this->assertNull($batch->dataReader);
// query with index
$query = new Query();
$query->from('tbl_customer')->indexBy('name');
$allRows = [];
foreach ($query->batch(2, $db) as $rows) {
$allRows = array_merge($allRows, $rows);
}
$this->assertEquals(3, count($allRows));
$this->assertEquals('address1', $allRows['user1']['address']);
$this->assertEquals('address2', $allRows['user2']['address']);
$this->assertEquals('address3', $allRows['user3']['address']);
// query in batch 1
$query = new Query();
$query->from('tbl_customer')->orderBy('id');
$allRows = [];
foreach ($query->batch(1, $db) as $rows) {
$allRows[] = $rows;
}
$this->assertEquals(3, count($allRows));
$this->assertEquals('user1', $allRows[0]['name']);
$this->assertEquals('user2', $allRows[1]['name']);
$this->assertEquals('user3', $allRows[2]['name']);
$query = new Query();
$query->from('tbl_customer')->orderBy('id');
$allRows = [];
foreach ($query->each($db) as $rows) {
$allRows[] = $rows;
}
$this->assertEquals(3, count($allRows));
$this->assertEquals('user1', $allRows[0]['name']);
$this->assertEquals('user2', $allRows[1]['name']);
$this->assertEquals('user3', $allRows[2]['name']);
}
public function testActiveQuery()
{
$db = $this->getConnection();
$query = Customer::find()->orderBy('id');
$customers = [];
foreach ($query->batch(2, $db) as $models) {
$customers = array_merge($customers, $models);
}
$this->assertEquals(3, count($customers));
$this->assertEquals('user1', $customers[0]->name);
$this->assertEquals('user2', $customers[1]->name);
$this->assertEquals('user3', $customers[2]->name);
// query in batch 1
$query = Customer::find()->orderBy('id');
$customers = [];
foreach ($query->batch(1, $db) as $model) {
$customers[] = $model;
}
$this->assertEquals(3, count($customers));
$this->assertEquals('user1', $customers[0]->name);
$this->assertEquals('user2', $customers[1]->name);
$this->assertEquals('user3', $customers[2]->name);
// batch with eager loading
$query = Customer::find()->with('orders')->orderBy('id');
$customers = [];
foreach ($query->batch(2, $db) as $models) {
$customers = array_merge($customers, $models);
foreach ($models as $model) {
$this->assertTrue($model->isRelationPopulated('orders'));
}
}
$this->assertEquals(3, count($customers));
$this->assertEquals(1, count($customers[0]->orders));
$this->assertEquals(2, count($customers[1]->orders));
$this->assertEquals(0, count($customers[2]->orders));
}
}