Tree-like structures are very common in programming, and you will often see some form of parent-child relationship between classes. This is where PHP has a problem that you, as a developer, should care about.
PHP has a built-in garbage collector, so you do not need to track references to objects, allocate memory for them, or delete them when they are no longer needed. Things seem so perfect that developers often do not realize how much memory their scripts allocate until the server stops processing requests because of an out-of-memory error.
Consider a simple scenario in which we need to build a parent-child relationship multiple times:
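class Node {
    public $parentNode;
    public $childNodes = array();
    public $nodeValue;

    function __construct() {
        // Give every node a 1,280-byte payload so its memory cost is visible
        $this->nodeValue = str_repeat('0123456789', 128);
    }
}

function createRelationship() {
    $parent = new Node();
    $child = new Node();
    $parent->childNodes[] = $child;  // parent refers to child
    $child->parentNode = $parent;    // child refers back to parent
}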
Now let's look at how much memory the script has allocated after 10,000 calls to the createRelationship() function:
echo 'Initial: ' . number_format(memory_get_usage(), 0, '.', ',') . " bytes\n";
for ($i = 0; $i < 10000; $i++) {
    createRelationship();
}
echo 'Peak: ' . number_format(memory_get_peak_usage(), 0, '.', ',') . " bytes\n";
echo 'End: ' . number_format(memory_get_usage(), 0, '.', ',') . " bytes\n";
And the output:
Initial: 62,224 bytes
Peak: 34,905,216 bytes
End: 34,905,016 bytes
So, this simple script allocates almost 35 MB. A modern system with 1-2 GB of memory can run roughly 30 to 60 such processes at the same time. But in a shared hosting environment, or when support for 100 simultaneous connections is a system requirement, such memory consumption is unacceptable.
Understanding the processes behind this code is essential to improving it and to avoiding similar memory problems in the future. The key phrase is circular reference. Detecting circular references is a very hard problem for a garbage collector because all in-memory objects have to be analyzed. This is a very time-consuming operation, so the PHP engine developers decided to simply destroy all objects on script shutdown and not to share anything between scripts. In other words, PHP's strategy is to be fast rather than memory-aware. To be honest, this strategy works well, and a developer may never have to write special code for destroying objects. But that is no reason to ignore this aspect of PHP.
Let's go back to the createRelationship() function. The garbage collector destroys objects that are out of scope and that no other objects refer to. So you might expect the $child and $parent objects to be destroyed when PHP leaves the function, but the two objects refer to each other, so PHP cannot delete $child because $parent refers to it, and vice versa.
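To see this effect in isolation, here is a minimal sketch that counts destructor calls instead of measuring memory. TrackedNode, its static $destroyed counter, and the two helper functions are hypothetical names introduced only for this illustration; they are not part of the script above.

```php
<?php
// Hypothetical class for illustration: it only counts destructor calls.
class TrackedNode {
    public static $destroyed = 0;   // number of destructor calls so far
    public $parentNode = null;
    public $childNodes = array();

    function __destruct() {
        self::$destroyed++;
    }
}

// The parent refers to the child, but not the other way round.
function createOneWay() {
    $parent = new TrackedNode();
    $child = new TrackedNode();
    $parent->childNodes[] = $child;
}

// Same as above, plus the back reference that closes the cycle.
function createCircular() {
    $parent = new TrackedNode();
    $child = new TrackedNode();
    $parent->childNodes[] = $child;
    $child->parentNode = $parent;
}

createOneWay();
$afterOneWay = TrackedNode::$destroyed;

createCircular();
$afterCircular = TrackedNode::$destroyed;

echo "Destructor calls after one-way link: $afterOneWay\n";
echo "Destructor calls after circular link: $afterCircular\n";
```

Both lines should report 2. The one-way pair is destroyed as soon as its function returns, while the circular pair adds nothing to the count at that point, because each object is still referenced by the other.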
The common solution is to create a special destructor-like method that deletes the references, or to delete them by hand when they are no longer needed. Putting this code into the "magic" __destruct() method does not work, because PHP calls __destruct() only when it detects that an object can be destroyed, and for $child and $parent that happens too late (on request shutdown).
So, let's add a new method to the Node class and call it for $parent in the createRelationship() function:
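class Node {
    public $parentNode;
    public $childNodes = array();
    public $nodeValue;

    function __construct() {
        // Give every node a 1,280-byte payload so its memory cost is visible
        $this->nodeValue = str_repeat('0123456789', 128);
    }

    // Drop this node's references to its parent and its children
    function destroy() {
        $this->parentNode = null;
        $this->childNodes = array();
    }
}

function createRelationship() {
    $parent = new Node();
    $child = new Node();
    $parent->childNodes[] = $child;
    $child->parentNode = $parent;
    $parent->destroy();  // break the cycle before leaving the function
}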
Run again with the modified createRelationship() function, the test displays the following results:
Initial: 63,848 bytes
Peak: 69,688 bytes
End: 65,168 bytes
So, peak memory usage decreased by a factor of 500. What better reason could there be to learn the subject?
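Note that destroy() clears only the links of the node it is called on, which is enough for a single parent and child. In a deeper tree the links between children and grandchildren would remain, so a recursive variant is probably needed. The following is a sketch under that assumption; TreeNode, addChild(), buildAndDestroyTree() and the $destroyed counter are hypothetical names used only for illustration.

```php
<?php
// Hypothetical variant of Node with a destructor counter, for illustration only.
class TreeNode {
    public static $destroyed = 0;   // number of destructor calls so far
    public $parentNode = null;
    public $childNodes = array();

    // Link a child in both directions, as createRelationship() does.
    function addChild(TreeNode $child) {
        $child->parentNode = $this;
        $this->childNodes[] = $child;
    }

    // Break the links of the whole subtree, not only of this node.
    function destroy() {
        $children = $this->childNodes;  // keep the list while we walk it
        $this->parentNode = null;
        $this->childNodes = array();
        foreach ($children as $child) {
            $child->destroy();
        }
    }

    function __destruct() {
        self::$destroyed++;
    }
}

function buildAndDestroyTree() {
    $root = new TreeNode();
    $child = new TreeNode();
    $grandchild = new TreeNode();
    $root->addChild($child);
    $child->addChild($grandchild);
    $root->destroy();  // one call on the root unlinks all three nodes
}

buildAndDestroyTree();
echo 'Destroyed: ' . TreeNode::$destroyed . "\n";
```

With the recursive destroy() all three nodes are released when buildAndDestroyTree() returns, so the script prints "Destroyed: 3". With a one-level destroy() the child and grandchild would still refer to each other and none of the three would be released at that point.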
Update: Paul M. Jones has written about the same subject, and the related discussion brought the following news from php.internals to light: the circular references patch will not be implemented in PHP 5.3 because it did not get enough votes.
Comments
This issue will be solved in PHP 5.3
David Wang has written a patch for the garbage collector to pick up circular references. Implementing the patch has been placed on the PHP 5.3 Suggested Feature List.
http://news.php.net/php.internals/30790 (David Wang's e-mail)
http://news.php.net/php.internals/32330 (PHP 5.3 Suggested feature list)
Arnold Daniels
http://blog.adaniels.nl
Thank you!
This annoying issue has been in PHP for years, and I'm glad to see PHP getting better. Unfortunately, there are a lot of hosting companies and applications still on PHP 4/5.1/5.2, which makes me believe that this information will remain useful for at least a couple of years, if not more.