Optimize PHP memory usage: eliminate circular references
Tree-like structures are very common in programming, and you will often see some form of parent-child relationship between classes. This is where PHP has a problem that you, as a developer, should care about.
PHP has a built-in garbage collector, so you do not need to track references to objects, allocate memory for them, or delete them when they are no longer needed. Things seem so perfect that developers often do not realize how much memory their scripts allocate until the server stops processing requests because of an out-of-memory error.
Consider a simple scenario in which we need to build a parent-child relationship many times:
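class Node {
    public $parentNode;
    public $childNodes = array();
    public $nodeValue;

    // __construct() replaces the PHP 4-style Node() constructor,
    // which PHP 8 no longer treats as a constructor.
    function __construct() {
        // Roughly 1.3 KB of payload per node.
        $this->nodeValue = str_repeat('0123456789', 128);
    }
}

function createRelationship() {
    $parent = new Node();
    $child = new Node();
    // The two objects now refer to each other: a circular reference.
    $parent->childNodes[] = $child;
    $child->parentNode = $parent;
}
And then let's review the amount of memory allocated by this script after 10,000 calls to the createRelationship() function: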
echo 'Initial: ' . number_format(memory_get_usage(), 0, '.', ',') . " bytes\n";
for ($i = 0; $i < 10000; $i++) {
    createRelationship();
}
echo 'Peak: ' . number_format(memory_get_peak_usage(), 0, '.', ',') . " bytes\n";
echo 'End: ' . number_format(memory_get_usage(), 0, '.', ',') . " bytes\n";
And the output:
Initial: 327,336 bytes
Peak: 35,824,176 bytes
End: 35,823,328 bytes
So, such a simple script allocates about 34 MB. Systems with 1-2 GB of RAM can run 30-60 similar processes at the same time. But in a shared hosting environment, or when supporting 100 simultaneous connections is a system requirement, such memory consumption is unacceptable.
Understanding the processes behind this code is absolutely necessary for improving it and avoiding similar memory problems in the future. The key phrase is circular reference. Detecting circular references is a hard problem for a garbage collector because all in-memory objects have to be analyzed. This is a very time-consuming operation, so the PHP engine developers decided to simply destroy all objects on script shutdown and not share anything between scripts. The PHP strategy is thus to be fast rather than memory-aware. Honestly speaking, this strategy works well, and a developer may never have to write special code for destroying objects. But this is no reason to ignore this aspect of PHP.
Let's go back to the createRelationship() function. The garbage collector destroys objects that are out of scope and that no other objects refer to. So you might expect the $child and $parent objects to be destroyed when PHP leaves the function, but the objects refer to each other, so PHP cannot delete $child because $parent refers to it, and vice versa.
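The effect described above can be observed directly. The following is a minimal sketch, assuming PHP 7.4 or later (for WeakReference); the Item class and both helper functions are hypothetical names used only for illustration. It builds one pair of objects with a one-way link and one pair with a circular link, then checks which pair is gone after the function returns.

```php
<?php
// Sketch only: Item, makeChain() and makeCycle() are hypothetical names.
// Requires PHP 7.4+ for WeakReference.
class Item
{
    public $other = null;
}

// Two objects with a one-way link: no cycle.
function makeChain(): array
{
    $a = new Item();
    $b = new Item();
    $a->other = $b;
    // Weak references let us observe the objects without keeping them alive.
    return [WeakReference::create($a), WeakReference::create($b)];
}

// Two objects that point at each other: a cycle.
function makeCycle(): array
{
    $a = new Item();
    $b = new Item();
    $a->other = $b;
    $b->other = $a;
    return [WeakReference::create($a), WeakReference::create($b)];
}

[$chainA, $chainB] = makeChain();
[$cycleA, $cycleB] = makeCycle();

// get() returns null once the object has been destroyed.
$chainFreed = $chainA->get() === null && $chainB->get() === null;
$cycleFreed = $cycleA->get() === null && $cycleB->get() === null;

echo 'Chain freed on return: ' . var_export($chainFreed, true) . "\n";
echo 'Cycle freed on return: ' . var_export($cycleFreed, true) . "\n";
```

The one-way pair is released as soon as the function returns, while the circular pair is still alive at that point, which is exactly the situation createRelationship() produces.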
The common solution is to write a special cleanup method that removes the references, or to remove them by hand when they are no longer necessary. Putting this code into the "magic" __destruct() method does not work, because PHP calls __destruct() only when it detects that an object can be destroyed, and for $child and $parent that happens too late (on request shutdown).
So, let's add a new method to the Node class and call it for $parent in the createRelationship() function:
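To see why __destruct() comes too late, here is a small sketch. LoggedNode, its static $log and buildPair() are hypothetical names, not part of the original script. The destructor only records that it ran, and the explicit destroy() method is what breaks the links.

```php
<?php
// Sketch only: LoggedNode, $log and buildPair() are hypothetical names.
class LoggedNode
{
    public static $log = [];
    public $name;
    public $parentNode = null;
    public $childNodes = [];

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    // Explicit cleanup: break the links so reference counts can reach zero.
    public function destroy(): void
    {
        $this->parentNode = null;
        $this->childNodes = [];
    }

    // Runs only once nothing refers to the object any more.
    public function __destruct()
    {
        self::$log[] = $this->name;
    }
}

function buildPair(bool $cleanUp): void
{
    $parent = new LoggedNode('parent');
    $child = new LoggedNode('child');
    $parent->childNodes[] = $child;
    $child->parentNode = $parent;
    if ($cleanUp) {
        $parent->destroy();
    }
}

buildPair(false);
$withoutCleanup = count(LoggedNode::$log); // no destructor has run yet

buildPair(true);
$withCleanup = count(LoggedNode::$log);    // both objects of the second pair are gone

echo "Destructors run without destroy(): $withoutCleanup\n";
echo "Destructors run with destroy(): $withCleanup\n";
```

Without the explicit call no destructor runs when the function returns, so cleanup code placed in __destruct() would never get the chance to break the cycle.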
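class Node {
    public $parentNode;
    public $childNodes = array();
    public $nodeValue;

    // __construct() replaces the PHP 4-style Node() constructor,
    // which PHP 8 no longer treats as a constructor.
    function __construct() {
        $this->nodeValue = str_repeat('0123456789', 128);
    }

    // Break the links to the parent and the children by hand.
    function destroy()
    {
        $this->parentNode = null;
        $this->childNodes = array();
    }
}

function createRelationship() {
    $parent = new Node();
    $child = new Node();
    $parent->childNodes[] = $child;
    $child->parentNode = $parent;
    // Remove the circular reference before leaving the function.
    $parent->destroy();
}
Run again with the modified version of the createRelationship() function, the test displays the following results: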
Initial: 328,416 bytes
Peak: 335,304 bytes
End: 328,520 bytes
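Note that the destroy() method above clears only the links held by the node it is called on, which is enough for a single parent and child. For deeper trees the same idea would presumably have to be applied recursively. The following sketch assumes a hypothetical TreeNode class with an addChild() helper; neither is part of the original script.

```php
<?php
// Sketch only: TreeNode and addChild() are hypothetical names.
class TreeNode
{
    public static $destroyed = 0;
    public $parentNode = null;
    public $childNodes = [];

    public function addChild(TreeNode $child): void
    {
        $child->parentNode = $this;
        $this->childNodes[] = $child;
    }

    // Break the links of this node and of every descendant.
    public function destroy(): void
    {
        $children = $this->childNodes;
        $this->parentNode = null;
        $this->childNodes = [];
        foreach ($children as $child) {
            $child->destroy();
        }
    }

    // Counts destroyed nodes so the effect is visible.
    public function __destruct()
    {
        self::$destroyed++;
    }
}

function buildTree(): void
{
    $root = new TreeNode();
    $child = new TreeNode();
    $grandchild = new TreeNode();
    $root->addChild($child);
    $child->addChild($grandchild);
    $root->destroy();
}

buildTree();
echo TreeNode::$destroyed . " nodes destroyed\n";
```

With the recursive version all three nodes are released when buildTree() returns. With a non-recursive destroy() the child and grandchild would still refer to each other.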
Another solution, which became available in PHP 5.3, is to use the new garbage collector, which analyzes circular references between objects and destroys unused objects more efficiently.
With the new garbage collector turned on, the first test (without destroy()) shows a much better result:
Initial: 327,136 bytes
Peak: 18,059,504 bytes
End: 825,656 bytes
On the one hand, this is twice as good as in earlier versions of PHP. On the other hand, it is still as far from efficient memory usage as 17 is from 0.06.
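For reference, the cycle collector is controlled by the zend.enable_gc ini setting and can also be switched at run time with gc_enable() and gc_disable(). A minimal sketch (the variable names are illustrative only):

```php
<?php
// The cycle collector is on by default (ini setting zend.enable_gc).
// It can also be switched while the script is running.
gc_enable();
$enabled = gc_enabled();   // state after gc_enable()

gc_disable();
$disabled = gc_enabled();  // state after gc_disable()

gc_enable();               // restore the default

echo 'Enabled: ' . var_export($enabled, true) . "\n";
echo 'Disabled: ' . var_export($disabled, true) . "\n";
echo 'zend.enable_gc = ' . ini_get('zend.enable_gc') . "\n";
```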
Relying on the garbage collector in a memory-critical application is not a good idea, simply because the garbage collector does not run all the time. It does not scan all memory on return from every function, because that would be very slow. Instead, as in other languages such as C# and Java, you can force a collection at the moment you think is best for cleaning up memory by calling gc_collect_cycles(), for example before loading a big file into memory.
The result of createRelationship() with gc_collect_cycles() is practically identical to the result of createRelationship() with the destroy() method:
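As a rough sketch of that pattern: the loop below stands in for whatever earlier work left cycles behind, and $bigFilePath is a hypothetical variable. A collection is forced right before the memory-hungry step, and gc_collect_cycles() returns the number of collected cycles.

```php
<?php
gc_enable();

// Leave some unreachable cycles behind (stand-in for earlier work).
for ($i = 0; $i < 100; $i++) {
    $a = new stdClass();
    $b = new stdClass();
    $a->peer = $b;
    $b->peer = $a;
    unset($a, $b);
}

$before = memory_get_usage();
$collected = gc_collect_cycles(); // force a collection now
$after = memory_get_usage();

echo "Collected: $collected, freed: " . ($before - $after) . " bytes\n";

// Now perform the memory-hungry step, for example:
// $data = file_get_contents($bigFilePath); // $bigFilePath is hypothetical
```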
function createRelationship() {
    $parent = new Node();
    $child = new Node();
    $parent->childNodes[] = $child;
    $child->parentNode = $parent;
    gc_collect_cycles();
}
Result:
Initial: 327,264 bytes
Peak: 335,112 bytes
End: 330,816 bytes
The new garbage collector is good, but it comes with a performance cost. The following table shows how turning the garbage collector on or off and calling gc_collect_cycles() affect performance:
| gc | memory cleanup | time (ms) | memory (max/end, MB, rounded) |
| on | gc_collect_cycles() | 43 | 0/0 |
| on | destroy() | 44 | 0/0 |
| on | - | 74 | 18/0 |
| off | gc_collect_cycles() | 43 | 0/0 |
| off | destroy() | 46 | 0/0 |
| off | - | 49 | 35/35 |
Summary: the garbage collector trades performance for memory; with your help, this trade becomes more profitable for your scripts.
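The article does not show how the timings were taken. One way such numbers could be gathered is sketched below; BenchNode, makePair() and measure() are hypothetical names, and microtime() is just one possible clock.

```php
<?php
// Sketch only: BenchNode, makePair() and measure() are hypothetical names.
class BenchNode
{
    public $parentNode = null;
    public $childNodes = [];
    public $nodeValue;

    public function __construct()
    {
        $this->nodeValue = str_repeat('0123456789', 128);
    }
}

// Builds one parent-child pair with a circular reference.
function makePair(): void
{
    $parent = new BenchNode();
    $child = new BenchNode();
    $parent->childNodes[] = $child;
    $child->parentNode = $parent;
}

// Runs $work $iterations times and reports elapsed time and memory.
function measure(callable $work, int $iterations): array
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $work();
    }
    return [
        'ms' => (microtime(true) - $start) * 1000,
        'peak_mb' => memory_get_peak_usage() / 1048576,
        'end_mb' => memory_get_usage() / 1048576,
    ];
}

gc_enable(); // or gc_disable() for the "off" rows
$result = measure('makePair', 10000);
printf(
    "%.1f ms, peak %.1f MB, end %.1f MB\n",
    $result['ms'],
    $result['peak_mb'],
    $result['end_mb']
);
```

Because memory_get_peak_usage() reports the peak for the whole process, each combination would need to run in a separate process to get comparable numbers.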



Comments
Trying to add gc_collect_cycles();
Hi Alex - this is a really good article and extremely practical as well.
Thanks!
I'm trying to add gc_collect_cycles(); to my shopping cart page before the checkout page (which has a lot of scripting) to free up memory.
Then I want to add it again on the checkout success page to free up that memory as well.
On my home dev box I've been able to add it fine, and no errors are coming up on the page, so I expect it's running with no problems.
But when I add the call to my GoDaddy shared hosting pages I get the error: Call to Undefined Function.. etc.
My dev server is on PHP 5.2.1 and the GoDaddy server is on PHP 5.2.14.
Any idea why I am getting the error on GoDaddy?
Thanks for your help,
tL
Another question
The more important question in this case, I think, is why you do not get this error in your development environment. The new memory management functions were added only in the PHP 5.3 branch, so they should not exist in 5.2.1 or in 5.2.14.
You can try to run the following code:
var_dump(phpversion());
var_dump(function_exists('gc_collect_cycles'));
You should have something like "5.2.1 - false" for your dev environment and "5.2.14 - false" for your hosting environment.
Regards,
Alex
Great Article
Great article. I have a memory issue with the wp-e-commerce plug-in for WordPress. The problem is that it's not my code, so fixing it is hectic. When it gives the out-of-memory error, is the page it refers to the page with the memory leak (if I can call it that)? Thanks.
Thanks Alex, nice one.
Thanks Alex, nice one. I couldn't see the performance drawback, but I do write very long-running scripts that have to run thousands of times, so memory leaks were an interesting subject.
The technique we use is to run the scripts within a loop with system() and stop after a certain number of cycles to release memory. But I wasn't very happy with this solution, so I have been searching for an alternative for a while now.
Cheers.