Tuesday, January 28, 2014

Stack allocation for Java

In my previous post, I mentioned how useful stack allocation would be for Java, and expressed my surprise, bordering on exasperation, that it is not available.

Since then, I've thought about how such a thing might work, and realized it is impossible to include without either breaking Java's security model, or throwing away any benefits through additional checks.

First, let me describe a possible implementation: We define a new stacknew operator which acts just like the new operator, except that it returns a reference to an object allocated in the current stack frame, instead of on the heap.

Now for a few design choices, what I would choose, and why:

  1. During construction, should all instances of new act like stacknew? No. They should act like normal new. Although such an automatic conversion to stacknew during construction might be convenient, it would likely cause problems:
    1. It would encourage the use of stacknew to construct objects not intended for stack-frame lifetime, through behaviors like saving references to themselves in other objects.
    2. It would bring up the question of converting all new instances to stacknew when executed by the object, which would be impossible to get right. Should we convert method code? Static method code? Inherited code from superclasses? Taint all calls made by the object to other objects? It is too messy and not predictable.
    3. Therefore, stacknew must be intended for use on classes specifically designed for it.
  2. Should it be possible to determine if the this object was allocated with stacknew? Yes. This will allow special-purpose code that is stacknew-aware to modify its behavior, using calls to stacknew instead of new when it is stack-allocated. Otherwise it would be difficult to write a general-purpose class without passing around a boolean flag indicating which new operator to use. In fact, Object should have a new method, isStackAllocated(), probably public.
  3. What should happen when dereferencing a reference to a stack-allocated object whose stack frame has returned? This should generate a NullPointerException, or similar. Maybe a new Error subclass would be called for. And this is where the trouble starts...

To fit with Java's model of simply disallowing any sort of unsafe memory access, it must not be possible to successfully dereference a reference to a stack-allocated object whose frame has returned. "Well, that doesn't sound so hard!" I hear people thinking. But it is hard. Surprisingly so.

So how do you know, when you try to dereference a pointer, whether it points to an allocation in an invalid stack frame? You cannot depend on any property of the memory in that frame. Here are the possible scenarios when trying to perform an access:

  1. The object is still there and fine. This is what you would expect during a dereference operation.
  2. The frame returned, but no new frame has overwritten that part of the stack yet. This could be considered lucky, but it would be a nasty bug to fix if something changed.
  3. The frame returned, and a smaller function overwrote part of the memory but not all of it. This would likely produce indeterminate behavior; what if you grabbed a reference out of the corrupted object and then tried to dereference that? Some may be valid while others are not.
  4. The frame returned, and other functions completely overwrote the object.
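
The four scenarios can be reproduced in a toy model. This is only a sketch under loud assumptions: the "stack" is a plain long[] array, a "reference" is a raw index into it, and pushFrame, popFrame, and read are hypothetical helpers invented for illustration, not anything a real JVM exposes.

```java
import java.util.Arrays;

// Toy model only: the "stack" is a long[] and a "reference" is a raw index.
// All names here are hypothetical; nothing below is real JVM behavior.
public class DanglingFrameDemo {
    static final long[] stack = new long[16];
    static int top = 0; // index of the next free slot

    // "Call" a function whose frame occupies `size` slots, each set to `fill`.
    // Returns the index of the frame's first slot.
    static int pushFrame(int size, long fill) {
        int base = top;
        Arrays.fill(stack, base, base + size, fill);
        top += size;
        return base;
    }

    // "Return" from the most recent frame. The memory is left untouched.
    static void popFrame(int size) {
        top -= size;
    }

    // Unchecked dereference: read `size` slots starting at `ref`.
    static long[] read(int ref, int size) {
        return Arrays.copyOfRange(stack, ref, ref + size);
    }

    public static void main(String[] args) {
        // Scenario 1: the frame is alive and the 4-slot object holds 7s.
        int obj = pushFrame(4, 7L);
        System.out.println(Arrays.toString(read(obj, 4))); // [7, 7, 7, 7]

        // Scenario 2: the frame returned, but nothing overwrote it yet.
        popFrame(4);
        System.out.println(Arrays.toString(read(obj, 4))); // [7, 7, 7, 7]

        // Scenario 3: a smaller frame overwrote only part of the object.
        pushFrame(2, -1L);
        System.out.println(Arrays.toString(read(obj, 4))); // [-1, -1, 7, 7]
        popFrame(2);

        // Scenario 4: a later frame overwrote the whole object.
        pushFrame(4, -1L);
        System.out.println(Arrays.toString(read(obj, 4))); // [-1, -1, -1, -1]
        popFrame(4);
    }
}
```

Note that scenarios 1 and 2 produce identical reads, which is exactly why the contents of the memory cannot be used to tell a live object from a dead one.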

So how do we protect against this? As I said, we cannot trust any of the object's memory. Here is the solution I came up with:

  1. Each stack frame containing any stack-allocated object will have a non-zero cryptographically secure identifier (random number) at a known offset from the beginning of the stack frame.
  2. Every reference will have a type, either heapnew or stacknew. Probably a bit flag somewhere.
  3. Every stacknew reference will contain, in addition to the object pointer, a pointer to the beginning of the stack frame and the value of the stack frame's identifier.
  4. On every dereference, the JVM will check if it is a stacknew reference. If so, it will first verify that the frame still exists before finishing the dereference operation. If the frame no longer exists, it will throw an exception or error. This verification is performed by comparing the frame identifier stored in the reference against the value found at the location where the identifier would be if the frame were still alive.
  5. Any time a stack frame returns, the identifier field in the frame is set to zero.
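
The scheme above can be sketched as a small simulation. Everything here is an assumption made for illustration: the stack is again a long[] array, the identifier lives in the first slot of each frame, and StackRef, enterFrame, leaveFrame, stackNew, and deref are hypothetical names. The heapnew/stacknew flag from step 2 is left out, since the model only ever creates stacknew references.

```java
import java.security.SecureRandom;

// Simulation of the frame-identifier check. All names are hypothetical.
public class FrameIdCheckDemo {
    static final SecureRandom RNG = new SecureRandom();
    static final long[] stack = new long[64]; // simulated stack memory
    static int top = 0;                       // index of the next free slot

    // Step 3: a stacknew reference carries three words instead of one.
    static final class StackRef {
        final int frameBase;  // where the frame (and its identifier) begins
        final long frameId;   // identifier the frame had at allocation time
        final int objectSlot; // the object "pointer"

        StackRef(int frameBase, long frameId, int objectSlot) {
            this.frameBase = frameBase;
            this.frameId = frameId;
            this.objectSlot = objectSlot;
        }
    }

    // Step 1: a new frame gets a non-zero random identifier at offset 0.
    static int enterFrame(int objectSlots) {
        int base = top;
        long id;
        do { id = RNG.nextLong(); } while (id == 0L);
        stack[base] = id;
        top += 1 + objectSlots;
        return base;
    }

    // Step 5: a returning frame zeroes its identifier.
    static void leaveFrame(int base) {
        stack[base] = 0L;
        top = base;
    }

    // Allocate a one-slot "object" inside the frame and return a reference.
    static StackRef stackNew(int base, int slotInFrame, long value) {
        int slot = base + 1 + slotInFrame;
        stack[slot] = value;
        return new StackRef(base, stack[base], slot);
    }

    // Step 4: verify the frame is still alive before dereferencing.
    static long deref(StackRef ref) {
        if (stack[ref.frameBase] != ref.frameId) {
            throw new IllegalStateException("stack frame has returned");
        }
        return stack[ref.objectSlot];
    }

    public static void main(String[] args) {
        int frame = enterFrame(2);
        StackRef ref = stackNew(frame, 0, 42L);
        System.out.println(deref(ref)); // 42

        leaveFrame(frame);
        enterFrame(2); // a later call reuses the same memory with a fresh identifier
        try {
            deref(ref);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Even in this toy form, the cost is visible: every dereference pays for an extra memory read and a comparison, and every reference carries two additional fields.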

Why cryptographically secure? Because otherwise it would be theoretically possible for an attacker to guess what the identifier of a frame might be, and arrange to have a specific memory layout occur, resulting in successfully dereferencing a malicious pointer. This scheme makes it vanishingly unlikely to have a false positive match, malicious or otherwise.
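
To illustrate the guessing risk, here is a minimal sketch. It assumes, purely for the sake of the example, that the attacker learns or can reproduce the seed of a non-cryptographic generator; nonZeroId is a hypothetical helper. java.util.Random is documented to produce the same sequence for the same seed, whereas java.security.SecureRandom is designed to be unpredictable.

```java
import java.security.SecureRandom;
import java.util.Random;

// Why the source of frame identifiers matters. nonZeroId is a hypothetical helper.
public class FrameIdSourceDemo {
    // Draw identifiers until one is non-zero (zero is reserved for dead frames).
    static long nonZeroId(Random rng) {
        long id;
        do { id = rng.nextLong(); } while (id == 0L);
        return id;
    }

    public static void main(String[] args) {
        // Assumption: the attacker knows the seed of the JVM's generator.
        long frameId = nonZeroId(new Random(1234L));
        long attackerGuess = nonZeroId(new Random(1234L));
        System.out.println(frameId == attackerGuess); // true

        // SecureRandom (a subclass of Random) has no seed the attacker can replay.
        long secureId = nonZeroId(new SecureRandom());
        System.out.println(secureId != 0L); // true
    }
}
```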

I am reasonably confident that this solution will prevent accidental dereferencing of invalid memory, but it does not sound at all efficient, especially when you account for references needing to be three times their current size.

So, to conclude this surprisingly long-winded post, I am saddened to say that I seriously doubt Java could support a stacknew operator.

But that doesn't stop me from wanting one...
