PHP memory optimisation ideas

My vague rant about PHP 5.3’s memory usage on php.internals turned into something potentially more useful when Stanislav Malyshev (a.k.a. Stas) started responding to it in an intelligent way, forcing me to come up with some more concrete ideas and to justify them. Some of the resulting text is quoted below, edited so that it makes sense in this format.

$m = memory_get_usage();
$a = explode(',', str_repeat(',', 100000));
print (memory_get_usage() - $m)/100000;

This is said to use 170 to 260 bytes per element on a 64-bit architecture. I think this is excessive.

Stas: I do not see what could be removed from Bucket or zval without hurting the functionality.

Tim: Right, and that’s why PHP is so bad compared to other languages. Its one-size-fits-all data structure has to store a lot of data per element to support every possible use case. However, there is room for optimisation. For instance, an array could start off as being like a C++ std::vector. Then when someone inserts an item into it with a non-integer key, it could be converted to a hashtable. This could potentially give you a time saving as well, because conversion to a hashtable could resize the destination hashtable in one step instead of growing it O(log N) times.

Some other operations, like deleting items from the middle of the array or adding items past the end (leaving gaps) would also have to trigger conversion. The point would be to optimise the most common use cases for integer-indexed arrays.
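To make the idea concrete, here is a minimal C sketch of a hybrid array that stays packed (vector-like) while keys are sequential integers and converts to a hash form when that invariant breaks. All the names (hybrid_array, ha_push, ha_set) are invented for illustration; real PHP arrays are HashTables of Buckets, and the hashed path is elided here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical hybrid array: packed like a C++ std::vector until a
 * non-sequential key forces conversion to a hashtable. */

typedef enum { PACKED, HASHED } array_mode;

typedef struct {
    array_mode mode;
    size_t size, capacity;
    long *packed;       /* values for keys 0..size-1 while PACKED */
    /* hashed representation elided; we only record that conversion
     * happened, and note that it would be sized in one step */
} hybrid_array;

void ha_init(hybrid_array *a) {
    a->mode = PACKED;
    a->size = 0;
    a->capacity = 8;
    a->packed = malloc(a->capacity * sizeof *a->packed);
}

/* Append with the next integer key: stays packed. */
void ha_push(hybrid_array *a, long v) {
    if (a->mode != PACKED) return;            /* hashed path elided */
    if (a->size == a->capacity) {
        a->capacity *= 2;
        a->packed = realloc(a->packed, a->capacity * sizeof *a->packed);
    }
    a->packed[a->size++] = v;
}

/* Insert with an arbitrary key: a gap (or, in the real proposal, a
 * string key) triggers one-shot conversion, sizing the destination
 * hashtable once instead of growing it O(log N) times. */
void ha_set(hybrid_array *a, long key, long v) {
    if (a->mode == PACKED && key == (long)a->size) {
        ha_push(a, v);
        return;
    }
    if (a->mode == PACKED) {
        a->mode = HASHED;
        /* ...rehash packed[0..size-1] into a table of final size */
    }
    (void)key; (void)v;
}
```

Deletions from the middle would trigger the same conversion path as the gap case above.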

What about objects that can optionally pack themselves into a class-dependent structure and unpack on demand?

Stas: Objects can do pretty much anything in Zend Engine now, provided you do some C 🙂 For the engine, an object is basically a pointer and an integer; the rest is changeable. Of course, on the PHP level we need to have more, but that’s because certain things are just not doable at the PHP level. Do you have some specific use case that would allow memory usage to be reduced?

Tim: Basically I’m thinking along the same lines as the array optimisation I suggested above. For my sample class, the zend_class_entry would have a hashtable like:

v1 => 0, v2 => 1, v3 => 2, v4 => 3, v5 => 4, v6 => 5, v7 => 6, v8 => 7, v9 => 8, v10 => 9

The class is:

class C { var $v1, $v2, $v3, $v4, $v5, $v6, $v7, $v8, $v9, $v10; }

Then the object could be stored as a zval[10]. Object member access would be implemented by looking up the member name in the class entry hashtable and then using the resulting index into the zval[10]. When the object is unpacked (say if the user creates or deletes object members at runtime), then the object value becomes a hashtable.
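A toy C version of that access path, with invented names (class_entry, packed_object): the name-to-slot map lives once in the class entry, and each instance is just an array of values. I use a linear search where the real proposal would use the class entry hashtable, and plain longs where it would have zvals.

```c
#include <assert.h>
#include <string.h>

#define NPROPS 10

typedef struct {
    const char *prop_names[NPROPS];   /* shared per class, not per object */
} class_entry;

typedef struct {
    const class_entry *ce;
    long slots[NPROPS];               /* the zval[10] in the proposal */
} packed_object;

/* Property access: look the name up in the class entry, then index. */
int prop_index(const class_entry *ce, const char *name) {
    for (int i = 0; i < NPROPS; i++)
        if (strcmp(ce->prop_names[i], name) == 0)
            return i;
    return -1;  /* unknown property: the real engine would unpack here */
}

long get_prop(const packed_object *o, const char *name) {
    return o->slots[prop_index(o->ce, name)];
}

void set_prop(packed_object *o, const char *name, long v) {
    o->slots[prop_index(o->ce, name)] = v;
}
```

The saving comes from storing the property names once per class instead of once per object.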

Stas: That would mean having 2 object types – “packed” and “unpacked” – with all (or most) operations basically duplicated. However, for objects it’s easier than for arrays, since the objects API is more abstract. I’m not sure that would improve the situation though – a lot of objects are dynamic, and for those it would mean a penalty when the object is unpacked.

But this can be tested on the current engine (maybe even without breaking BC!) and if it gives good results it may be an option.

Tim: What about an oparray format with fewer 64-bit pointers and more smallish integers?

Stas: I’m not sure how the data an op array needs can be stored without using pointers.

Tim: Making oplines use a variable amount of memory (like they do in machine code) would be a great help.

For declarations, you could pack structures like zend_class_entry and zend_function_entry on to the end of the opline, and access them by casting the opline to the appropriate opcode-specific type. That would save pointers and also allocator overhead.

At the more extreme end of the spectrum, the compiler could produce a pointerless oparray, like JVM bytecode. Then when a function is executed for the first time, the oparray could be expanded, with pointers added, and the result cached. This would reduce memory usage for code which is never executed. And it would have the added advantage of making APC easier to implement, since it could just copy the whole unexpanded oparray with memcpy().
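A rough sketch of the pointerless idea, under invented names (flat_op, flat_oparray, expand): the stored form holds string operands as byte offsets into a blob, so it contains no addresses and could be copied with memcpy(); on first execution it is expanded into a pointered runtime form, which in the proposal would then be cached for the rest of the request.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char opcode;
    unsigned short str_off;     /* offset into the blob, not a pointer */
} flat_op;

typedef struct {
    unsigned nops;
    flat_op ops[4];
    char blob[64];              /* all string literals, concatenated */
} flat_oparray;

typedef struct {
    unsigned char opcode;
    const char *str;            /* real pointer, resolved at runtime */
} runtime_op;

/* Expand on demand: resolve offsets to pointers.  Functions that are
 * never called never pay for this step. */
runtime_op *expand(const flat_oparray *fa) {
    runtime_op *r = malloc(fa->nops * sizeof *r);
    for (unsigned i = 0; i < fa->nops; i++) {
        r[i].opcode = fa->ops[i].opcode;
        r[i].str = fa->blob + fa->ops[i].str_off;
    }
    return r;
}
```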

Stas: opcodes can be cached (bytecode caches do it) but op_array can’t really be cached between requests because it contains dynamic structures. Unlike Java, PHP does full cleanup after each request, which means no preserving dynamic data.

Tim: APC deep-copies the whole zend_op_array, see apc_copy_op_array() in apc_compile.c. It does it using an impressive pile of hacks which break with every major release and in some minor releases too. Every time the compiler allocates memory, there has to be a matching shared memory allocation in APC.

But maybe you missed my point. I’m talking about a cache which is cheap to construct and cleared at the end of each request. It would optimise tight loops of calls to user-defined functions. The dynamic data, like static variable hashtables, would be in it. The compact pointerless structure could be stored between requests, and would not contain dynamic data.

Basically a structure like the current zend_op_array would be created on demand by the executor instead of in advance by the compiler.

Stas: I’m not sure how using pointers in op_array in such manner would help though – you’d still need to store things like function names, for example, and since you need to store it somewhere, you’d also have some pointer to this place. Same goes for a bunch of other op_array’s properties – you’d need to store them somewhere and be able to find them, so I don’t see how you’d do it without a pointer of some kind involved.

Tim: You can do it with a length field and a char[1] at the end of the structure. When you allocate memory for the structure, you add some on for the string. Then you copy the string into the char[1], overflowing it.

If you need several strings, then you can have several byte offsets, which are added to the start of the char[1] to find the location of the string in question. You can make the offset fields small, say 16 bits.

But it’s mostly zend_op I’m interested in rather than zend_op_array. Currently if a zend_op has a string literal argument, you’d make a zval for it and copy it into op1.u.constant. But the zval allocation could be avoided. The handler could cast the zend_op to a zend_op_with_a_string, which would have a length field and an overflowed char[1] at the end for the string argument.
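Here is the char[1] overflow trick in miniature, with op_with_string standing in for the hypothetical zend_op_with_a_string: one allocation covers both the header and the string, so the operand needs no separate zval and no pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char opcode;
    unsigned short size;        /* total size of this variable-length op */
    unsigned short str_len;
    char str[1];                /* overflowed: string data follows in-line */
} op_with_string;

op_with_string *make_op(unsigned char opcode, const char *s) {
    size_t len = strlen(s);
    /* one allocation: header plus string (the 1 in char[1] covers the
     * NUL terminator) */
    op_with_string *op = malloc(sizeof *op + len);
    op->opcode = opcode;
    op->str_len = (unsigned short)len;
    op->size = (unsigned short)(sizeof *op + len);
    memcpy(op->str, s, len + 1);
    return op;
}
```

With several strings, the extra offset fields described above would all index into the region starting at str.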

A variable op size would make iterating through zend_op_array.opcodes slightly more awkward, something like:

for (; op < oparray_end; op = (zend_op*)((char*)op + op->size)) {

But obviously you could clean that up with a macro.
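Something like the following, with a toy op type standing in for zend_op: each op records its own total byte length, ops are laid end to end, and a macro hides the pointer arithmetic.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    unsigned char opcode;
    unsigned short size;        /* total bytes, including trailing data */
} toy_op;

#define NEXT_OP(op) ((toy_op *)((char *)(op) + (op)->size))

/* Copy an op header into a packed buffer at the given position. */
void write_op(char *p, unsigned char opcode, unsigned short size) {
    toy_op t = { opcode, size };
    memcpy(p, &t, sizeof t);
}

/* Walk a packed buffer of variable-size ops by their size fields. */
int count_ops(const char *buf, const char *end) {
    int n = 0;
    for (const toy_op *op = (const toy_op *)buf;
         (const char *)op < end;
         op = NEXT_OP(op))
        n++;
    return n;
}
```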

For the skeptical Mr. “everyone has 8GB of memory and tiny little data sets” Lerdorf, I could point out that reducing the average zend_op size and placing strings close to other op data will also make execution faster, due to the improved CPU cache hit rate.
