Measuring memory usage with strace

June 23rd, 2010

In the tradition of abusing high-level Linux tools to produce useful low-level data, I present a method for estimating peak memory usage in Linux by text-processing the output from strace:

This Perl script invokes an arbitrary command via strace. It adds up memory allocated by mmap2() with no location hint and the file descriptor set to -1, which is how malloc() typically allocates large amounts of memory. It also counts calls to brk(), and subtracts the sizes of munmap() calls for maps that were previously counted. Whenever the number changes, it outputs the current memory usage, rounded to the nearest megabyte.
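The accounting described above can be sketched in C rather than Perl. This is not the original script: the names mem_tracker, tracker_feed and so on are made up for illustration, the line formats assume 32-bit strace output such as "mmap2(NULL, 1052672, ..., -1, 0) = 0xb7d5e000", brk() is treated as heap growth since the first call seen, and a munmap() is only honoured when its address exactly matches a counted map.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_MAPS 1024   /* arbitrary limit for this sketch */

/* Running totals. All names here are hypothetical. */
struct mem_tracker {
    unsigned long map_addr[MAX_MAPS]; /* addresses of counted anonymous maps */
    unsigned long map_size[MAX_MAPS]; /* their sizes in bytes */
    int n_maps;
    int have_brk;                     /* set once the first brk() is seen */
    unsigned long brk_start;          /* program break at the first brk() */
    unsigned long brk_now;            /* most recent program break */
    unsigned long mapped;             /* total bytes in counted maps */
};

/* Current estimate: counted maps plus heap growth since the first brk(). */
static unsigned long tracker_total(const struct mem_tracker *t)
{
    unsigned long heap = t->brk_now > t->brk_start ? t->brk_now - t->brk_start : 0;
    return t->mapped + heap;
}

/* The estimate rounded to the nearest megabyte. */
static unsigned long tracker_megabytes(const struct mem_tracker *t)
{
    return (tracker_total(t) + 512UL * 1024) / (1024UL * 1024);
}

/* Feed one line of strace output. Unrecognised and failed calls are ignored. */
static void tracker_feed(struct mem_tracker *t, const char *line)
{
    unsigned long addr, size, ret;
    int fd, rc, i;

    assert(t != NULL && line != NULL);

    if (sscanf(line, "mmap2(NULL, %lu, %*[^,], %*[^,], %d, %*[^)]) = 0x%lx",
               &size, &fd, &ret) == 3) {
        /* Count only anonymous maps (fd == -1) with no location hint. */
        if (fd == -1 && t->n_maps < MAX_MAPS) {
            t->map_addr[t->n_maps] = ret;
            t->map_size[t->n_maps] = size;
            t->n_maps++;
            t->mapped += size;
        }
    } else if (sscanf(line, "munmap(0x%lx, %lu) = %d", &addr, &size, &rc) == 3) {
        /* Subtract only maps that were counted earlier. */
        for (i = 0; rc == 0 && i < t->n_maps; i++) {
            if (t->map_addr[i] == addr) {
                t->mapped -= t->map_size[i];
                t->map_addr[i] = t->map_addr[t->n_maps - 1];
                t->map_size[i] = t->map_size[t->n_maps - 1];
                t->n_maps--;
                break;
            }
        }
    } else if (sscanf(line, "brk(%*[^)]) = 0x%lx", &ret) == 1) {
        if (!t->have_brk) {
            t->brk_start = ret;
            t->have_brk = 1;
        }
        t->brk_now = ret;
    }
}
```

A caller would zero-initialise a struct mem_tracker, pass each strace line to tracker_feed(), and print tracker_megabytes() whenever its value differs from the last one printed.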

Other methods for measuring peak memory usage typically revolve around polling /proc for resident set size (RSS), which can miss short-lived allocations. The GNU time command (/usr/bin/time, not the one built in to bash) can show peak RSS, but in some applications this can be a vast overestimate of physical memory usage, due to the way Linux counts RSS.

My method provides a reasonable estimate of the amount of memory allocated with malloc(). That can be a useful thing to know.

PHP memory optimisation ideas

January 14th, 2010

My vague rant about PHP 5.3’s memory usage on php.internals turned into something potentially more useful when Stanislav Malyshev (a.k.a. Stas) started responding to it in an intelligent way, forcing me to come up with some more concrete ideas and to justify them. Some of the resulting text is quoted below, edited so that it makes sense in this format.

<?php
$m = memory_get_usage();
$a = explode(',', str_repeat(',', 100000));
print (memory_get_usage() - $m)/100000;
?>

This is said to use 170 to 260 bytes per element on a 64-bit architecture. I think this is excessive.

Stas: I do not see what could be removed from Bucket or zval without hurting the functionality.

Tim: Right, and that’s why PHP is so bad compared to other languages. Its one-size-fits-all data structure has to store a lot of data per element to support every possible use case. However, there is room for optimisation. For instance, an array could start off as being like a C++ std::vector. Then when someone inserts an item into it with a non-integer key, it could be converted to a hashtable. This could potentially give you a time saving as well, because conversion to a hashtable could resize the destination hashtable in one step instead of growing it O(log N) times.

Some other operations, like deleting items from the middle of the array or adding items past the end (leaving gaps) would also have to trigger conversion. The point would be to optimise the most common use cases for integer-indexed arrays.
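To make the conversion rules concrete, here is a minimal C sketch of the decision only, not of a full array implementation. Everything in it is hypothetical: the write_kind classification, the function names, and the minimum capacity of 8 are assumptions for illustration and do not come from the Zend Engine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Kinds of write a script can perform on an array (hypothetical classification). */
enum write_kind {
    WRITE_APPEND,      /* $a[] = $v      */
    WRITE_INT_KEY,     /* $a[5] = $v     */
    WRITE_STRING_KEY,  /* $a['k'] = $v   */
    WRITE_UNSET        /* unset($a[5])   */
};

/*
 * Would a vector-like array holding `count` elements (keys 0..count-1)
 * still be gap-free after this write? If not, it has to be converted.
 */
static bool write_keeps_packed(size_t count, enum write_kind kind, size_t index)
{
    switch (kind) {
    case WRITE_APPEND:
        return true;
    case WRITE_INT_KEY:
        /* Overwriting, or writing exactly at the end, leaves no gap. */
        return index <= count;
    case WRITE_UNSET:
        /* Only removing the last element avoids a hole in the middle. */
        return count > 0 && index == count - 1;
    case WRITE_STRING_KEY:
    default:
        return false;
    }
}

/*
 * Table size to allocate on conversion: the smallest power of two that
 * holds all existing elements, chosen in one step instead of by
 * repeated doubling during insertion.
 */
static size_t hash_capacity_for(size_t count)
{
    size_t cap = 8;   /* assumed minimum size */

    assert(count <= SIZE_MAX / 2);   /* keeps the doubling below from overflowing */
    while (cap < count)
        cap *= 2;
    return cap;
}
```

With 100000 elements, as in the explode() test above, hash_capacity_for() would pick 131072 slots in a single allocation.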

What about objects that can optionally pack themselves into a class-dependent structure and unpack on demand?

Stas: Objects can do pretty much anything in Zend Engine now, provided you do some C :) For the engine, object is basically a pointer and an integer, the rest is changeable. Of course, on PHP level we need to have more, but that’s because certain things just not doable on PHP level. Do you have some specific use case that would allow to reduce memory usage?

Tim: Basically I’m thinking along the same lines as the array optimisation I suggested above. For my sample class, the zend_class_entry would have a hashtable like:

v1 => 0, v2 => 1, v3 => 2, v4 => 3, v5 => 4, v6 => 5, v7 => 6, v8 => 7, v9 => 8, v10 => 9

The class is:

class C { var $v1, $v2, $v3, $v4, $v5, $v6, $v7, $v8, $v9, $v10; }

Then the object could be stored as a zval[10]. Object member access would be implemented by looking up the member name in the class entry hashtable and then using the resulting index into the zval[10]. When the object is unpacked (say if the user creates or deletes object members at runtime), then the object value becomes a hashtable.
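A rough C sketch of that lookup follows. The types are stand-ins, not the real zend_class_entry or zval: struct value replaces zval, and a linear scan over member names replaces the class entry hashtable, purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_MEMBERS 10

/* Stand-in for a zval. */
struct value {
    long n;
};

/* Stand-in for the per-class table mapping member name to slot index. */
struct class_entry {
    const char *member_names[N_MEMBERS];   /* position in the array is the slot index */
};

/* A packed object: a class pointer plus one slot per declared member. */
struct packed_object {
    const struct class_entry *ce;
    struct value slots[N_MEMBERS];
};

/* The table for "class C" from the text. */
static const struct class_entry class_C = {
    { "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", "v10" }
};

/* Look up a member name in the class table. Returns -1 if it is not declared. */
static int member_index(const struct class_entry *ce, const char *name)
{
    int i;

    for (i = 0; i < N_MEMBERS; i++) {
        if (strcmp(ce->member_names[i], name) == 0)
            return i;
    }
    return -1;
}

/*
 * Member access: name -> index via the class, index -> slot via the object.
 * A NULL result means the name is not a declared member, which is the
 * point where a real engine would have to unpack the object into a hashtable.
 */
static struct value *object_member(struct packed_object *obj, const char *name)
{
    int i;

    assert(obj != NULL && name != NULL);
    i = member_index(obj->ce, name);
    return i < 0 ? NULL : &obj->slots[i];
}
```

The per-object cost here is the slot array plus one pointer; the member names are stored once per class rather than once per object.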

Stas: That would mean having 2 object types – “packed” and “unpacked” with all (most of) operations basically duplicated. However, for objects it’s easier than for arrays since objects API is more abstract. I’m not sure that would improve situation though – a lot of objects are dynamic and for those it would mean a penalty when the object is unpacked.

But this can be tested on the current engine (maybe even without breaking BC!) and if it gives good results it may be an option.

Tim: What about an oparray format with fewer 64-bit pointers and more smallish integers?

Stas: I’m not sure how the data op array needs can be stored without using pointers.

Tim: Making oplines use a variable amount of memory (like they do in machine code) would be a great help.

For declarations, you could pack structures like zend_class_entry and zend_function_entry on to the end of the opline, and access them by casting the opline to the appropriate opcode-specific type. That would save pointers and also allocator overhead.

At the more extreme end of the spectrum, the compiler could produce a pointerless oparray, like JVM bytecode. Then when a function is executed for the first time, the oparray could be expanded, with pointers added, and the result cached. This would reduce memory usage for code which is never executed. And it would have the added advantage of making APC easier to implement, since it could just copy the whole unexpanded oparray with memcpy().

Stas: opcodes can be cached (bytecode caches do it) but op_array can’t really be cached between requests because it contains dynamic structures. Unlike Java, PHP does full cleanup after each request, which means no preserving dynamic data.

Tim: APC deep-copies the whole zend_op_array, see apc_copy_op_array() in apc_compile.c. It does it using an impressive pile of hacks which break with every major release and in some minor releases too. Every time the compiler allocates memory, there has to be a matching shared memory allocation in APC.

But maybe you missed my point. I’m talking about a cache which is cheap to construct and cleared at the end of each request. It would optimise tight loops of calls to user-defined functions. The dynamic data, like static variable hashtables, would be in it. The compact pointerless structure could be stored between requests, and would not contain dynamic data.

Basically a structure like the current zend_op_array would be created on demand by the executor instead of in advance by the compiler.

Stas: I’m not sure how using pointers in op_array in such manner would help though – you’d still need to store things like function names, for example, and since you need to store it somewhere, you’d also have some pointer to this place. Same goes for a bunch of other op_array’s properties – you’d need to store them somewhere and be able to find them, so I don’t see how you’d do it without a pointer of some kind involved.

Tim: You can do it with a length field and a char[1] at the end of the structure. When you allocate memory for the structure, you add some on for the string. Then you copy the string into the char[1], overflowing it.

If you need several strings, then you can have several byte offsets, which are added to the start of the char[1] to find the location of the string in question. You can make the offset fields small, say 16 bits.
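As a sketch of that layout, here is a structure carrying two strings in a single allocation. It uses a C99 flexible array member (char data[]) in place of the overflowed char[1], and NUL-terminated strings in place of length fields; the struct and its field names are invented for the example and are not part of zend_op_array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure: two strings, no pointers. */
struct packed_names {
    uint16_t name_offset;   /* byte offset of the function name within data[] */
    uint16_t file_offset;   /* byte offset of the file name within data[] */
    char data[];            /* string bytes follow the fixed fields */
};

/* Allocate the structure and both strings as one block. Returns NULL on failure. */
static struct packed_names *packed_names_new(const char *name, const char *file)
{
    size_t name_len = strlen(name) + 1;   /* include the terminating NUL */
    size_t file_len = strlen(file) + 1;
    struct packed_names *p;

    if (name_len + file_len > UINT16_MAX)
        return NULL;                      /* offsets must fit in 16 bits */

    p = malloc(sizeof *p + name_len + file_len);
    if (p == NULL)
        return NULL;

    p->name_offset = 0;
    p->file_offset = (uint16_t)name_len;
    memcpy(p->data + p->name_offset, name, name_len);
    memcpy(p->data + p->file_offset, file, file_len);
    return p;
}

/* Accessors: add the stored offset to the start of the trailing bytes. */
static const char *packed_name(const struct packed_names *p)
{
    assert(p != NULL);
    return p->data + p->name_offset;
}

static const char *packed_file(const struct packed_names *p)
{
    assert(p != NULL);
    return p->data + p->file_offset;
}
```

Because the block contains offsets rather than addresses, it can be copied with memcpy() and freed with a single free().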

But it’s mostly zend_op I’m interested in rather than zend_op_array. Currently if a zend_op has a string literal argument, you’d make a zval for it and copy it into op1.u.constant. But the zval allocation could be avoided. The handler could cast the zend_op to a zend_op_with_a_string, which would have a length field and an overflowed char[1] at the end for the string argument.

A variable op size would make iterating through zend_op_array.opcodes slightly more awkward, something like:

for (; op < oparray_end; op = (zend_op*)((char*)op + op->size)) {
   ...
}

But obviously you could clean that up with a macro.

For the skeptical Mr. “everyone has 8GB of memory and tiny little data sets” Lerdorf, I could point out that reducing the average zend_op size and placing strings close to other op data will also make execution faster, due to the improved CPU cache hit rate.


Response to mailing list posts about climate change action

December 16th, 2009

[Update: Aryeh has asked me to change instances of his well-known full name to his Wikipedia handle "Simetrical", to reduce the impact on Google.]

Simetrical has taken up an opposing position to my foundation-l post, which was copied to this blog under the title Should Wikimedia buy RECs. In the interests of avoiding offence to denizens of foundation-l, I am attempting to move that rather heated debate to here.

Simetrical attacks my views on multiple fronts, arguing (in my own words):

  • That Wikimedia should not spend money on causes unrelated to its mission;
  • That Wikimedia has no moral responsibility to take action on climate change, since it does not directly or voluntarily contribute to it;
  • That anthropogenic global warming (AGW) won’t have any significant effects for decades yet;
  • That AGW won’t directly cause human deaths, rather mere economic harm;
  • That future research may well make any present efforts redundant and, in hindsight, wasteful;
  • That action now may prove to be pointless since the impact of AGW may be catastrophic whatever we do;
  • That climate scientists do not understand the economics of mitigation and that serious economists, such as those behind the so-called Copenhagen Consensus, advocate alternative technologies such as albedo modification over mainstream approaches such as abatement and reforestation.

Quite a barrage. I’ve been taking these on point-by-point. Here are the archive links where you can read the full text of this debate:

I’ll post my latest response as a comment below. Let’s see if I can coax WordPress into presenting a comment interface that’s usable for this purpose.

Should Wikimedia buy RECs?

December 14th, 2009

Should the Wikimedia Foundation do something about climate change? Here’s what I said on foundation-l:

Given the lack of political will to make deep cuts to greenhouse gas emissions, and the pitiful excuses politicians make for inaction; given the present nature of the debate, where special interests fund campaigns aimed at stalling any progress by appealing to the ignorance of the public; given the nature of the Foundation, an organisation which raises its funds and conducts most of its activities in the richest and most polluting country in the world: I think there is an argument for voluntary reduction of emissions by the Foundation.

I don’t mean by buying tree-planting or efficiency offsets, of which I am deeply skeptical. I think the best way for Wikimedia to take action on climate change would be by buying renewable energy certificates (RECs). Buying RECs from new wind and solar electricity generators is a robust way to reduce CO2 emissions, with minimal danger of double-counting, forward-selling, outright fraud, etc., problems which plague the offset industry.

If Domas Mituzas is correct, and Wikimedia uses on the order of 100kW for its servers, then buying a matching number of RECs would be a small portion of our hosting budget. If funding is nevertheless a problem, then we could have a restricted donation drive, and thereby get a clear mandate from our reader community.

Our colocation facilities would not need to do anything, such as changing their electricity provider. We would, however, need monitoring of our total electricity usage, so that we would know how many RECs to buy.

I’m not appealing to the PR benefits here, or to the way this action would promote the climate change cause in general. I’m just saying that as an organisation composed of rational, moral people, Wikimedia has as much responsibility to act as does any other organisation or individual.

Ultimately, the US will need to reduce its per-capita emissions by around 90% by 2050 to have any hope of avoiding catastrophe (see e.g. table 9.3 in the Garnaut Review, and chapter 4.3 for more context). Nature doesn’t have exemptions or loopholes: we can’t continue emitting by moving economic activity from corporations to charities.

Secure web uploads

December 16th, 2008

I’ve written hundreds of mailing list posts over the years, in my role first as a volunteer software developer and system administrator for Wikipedia, and later as an employee in the same role. But I’ve never had my own domain name, and I’ve never had a blog.

But I do have things to say, and I’ve often thought about setting up a soap box such as this, with the aim of reaching a wider audience than the mailing lists I usually post to. An important issue has finally come up, and I feel compelled to tell you about it. So I have created this blog.

The issue is a basic feature, which is present in many web applications: file uploads. Due to design choices by the browsers, particularly Internet Explorer, it turns out to be extremely difficult to allow users to upload arbitrary files, without endangering the security of the application.

We spent a lot of time working on secure uploads for MediaWiki, and we thought we had it more or less right. But it turns out that our handling of Internet Explorer wasn’t nearly rigorous enough, and there were still a number of ways to use file uploads to steal the authentication cookies of Internet Explorer users. In MediaWiki 1.13.3, I have, hopefully, closed these gaps. I did this by reverse-engineering three versions of Internet Explorer.

In the rest of this post, I’ll give a tutorial on building a file upload application, working through the security pitfalls from the most naive to the most subtle. I’ll use PHP in my examples, but none of the issues here are PHP-specific.
