Friday, December 16, 2011

Ahead of Time compilation in Java.

Banal initialization sequences have long been a curse on the Java runtime. The roughly constant time it takes to boot the JVM, create and initialize the vital organs of the virtual machine, and load the kernel classes (such as java/lang/*) degrades the startup performance of enterprise applications at large, irrespective of the enterprise and the characteristics of the resident application.

While thinking of solutions to reduce startup time, an obvious thought is to lean on the dynamic (JIT) compiler, which can profile the Java bytecodes executed in this early phase of the JVM life cycle and translate those bytecodes to native code eagerly, overriding the usual compilation thresholds and policies.

But this technique has its own drawback: when you compile the methods that take part in startup at a high optimization level, the compilation effort adds its own overhead to the startup, and the end result can be worse performance.

Reflect upon this new problem and you get the next obvious solution: fall back to static compilation for these methods. One can imagine statically compiling the entire rt.jar (or core.jar) on the target host before the JVM starts, and simply using the compiled code at runtime, much in the same manner as statically compiled executables and libraries.

But static compilation does not fare well with Java. Apart from losing platform independence (which can be solved by compiling on the target host), several powerful optimizations such as virtual method inlining cannot be performed properly, because much of the information that a dynamic compiler obtains at runtime to guide the optimization is missing at static compilation time.

AoT is an attempt to address the startup delay in applications by compiling methods under an optimization plan that exploits all statically computable information to the best possible extent, while still performing better than interpreted bytecode. It is especially useful in clustered environments, where methods compiled at the AoT level in one JVM can be shared and reused by other JVMs that start later, boosting their startup drastically.

Tuesday, November 22, 2011

Data privacy in Java

Encapsulation is one of the most powerful object-oriented programming concepts. It is a technique by which member fields (the data representing an object) are given private access and are reached only through accessor methods.

Encapsulation provides a shield that regulates data access through getter and setter methods, so that the data stays tightly associated with the code around it and is protected from cluttered, random access by external code, that is, code which does not belong to the class that declares the member field.

But to what degree is this protection actually provided in Java? And in which context should this notion of data privacy be understood? Let us examine different scenarios.

1. When a member field is declared private and flanked by public getter and setter methods, external code can perform every action on the field that an internal method can: the private field can be read, written, and even purged (set to null). In short, a private field with plain setter and getter methods is as good as a public field in every execution aspect, except for the title bestowed by the language.

2. When a member field is declared private and exposed through a public getter method but no setter method, external code can still perform 'most' of the actions that an internal method can. The field can be read. If the field is a user-defined object, external code can also perform every action on it that the container object can, except purging (nullifying) the field. Since the getter method returns a copy of the field reference (not a copy of the object), the returned reference points to the same data in the heap. The component reference (the private field residing in the container) and the value returned by the getter method are therefore equivalent with respect to the permitted actions, and the underlying data can be modified by outside code as well. There are two exceptions: i) private primitive fields cannot be modified, because the getter method returns a copy of the value, not a reference to it. ii) The component reference cannot be nullified, because the getter method returns a copy of the reference, and nullifying that copy leaves the original untouched.

In short, in both these cases, if the programmer wants to prevent outside code from modifying the object behind a private field, the getter method has to clone the object and return the copy.
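The difference between returning the field reference and returning a clone can be sketched as follows. This is a minimal illustration, not a recommended design: the class names LeakyTeam, SafeTeam and EncapsulationDemo are hypothetical, and a copied ArrayList stands in for "cloning" the field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: exercises both getters from "outside" code.
class EncapsulationDemo {
    public static void main(String[] args) {
        LeakyTeam leaky = new LeakyTeam();
        leaky.getMembers().add("intruder");   // mutates the private list itself
        System.out.println(leaky.size());     // prints 1

        List<String> ref = leaky.getMembers();
        ref = null;                           // nullifies only the local copy of the reference
        System.out.println(leaky.getMembers() != null); // prints true

        SafeTeam safe = new SafeTeam();
        safe.getMembers().add("intruder");    // mutates only the returned copy
        System.out.println(safe.size());      // prints 0
    }
}

// Getter hands out the reference held in the private field.
class LeakyTeam {
    private final List<String> members = new ArrayList<>();

    List<String> getMembers() {
        return members;
    }

    int size() {
        return members.size();
    }
}

// Getter hands out a fresh copy, so the private list stays untouched.
class SafeTeam {
    private final List<String> members = new ArrayList<>();

    List<String> getMembers() {
        return new ArrayList<>(members);
    }

    int size() {
        return members.size();
    }
}
```

Note that the copy here is shallow. It is enough for a list of immutable strings, but if the elements were mutable objects, they would have to be copied as well to get the same protection.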

3. When the field is private and there are no getter or setter methods, the object appears to be hidden from outside code, but it is not. Given an object, all of its components can be accessed and modified, including being purged, from outside Java code by using the reflection APIs. In addition, there are undocumented 'unsafe' APIs (such as sun.misc.Unsafe) through which any object reference, any object, any part of the Java heap and any part of the process address space can be reached, with full read-write capability. All of this can be done programmatically, without violating the language semantics. This kind of data access can be restricted using custom security managers, but they have side effects.
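A minimal sketch of the reflection route described in scenario 3 is shown below. The names Vault, PrivateFieldAccess and ReflectionDemo are hypothetical. The sketch assumes that no security manager is installed and that all classes live on the same classpath (or in the same module); otherwise setAccessible may be refused.

```java
import java.lang.reflect.Field;

// Hypothetical demo class: reads and purges a private field it has no accessor for.
class ReflectionDemo {
    public static void main(String[] args) throws Exception {
        Vault vault = new Vault();
        System.out.println(PrivateFieldAccess.read(vault)); // prints s3cret
        PrivateFieldAccess.write(vault, null);              // purge the private field
        System.out.println(vault.isEmpty());                // prints true
    }
}

// A class with a private field and no getter or setter for it.
class Vault {
    private String secret = "s3cret";

    boolean isEmpty() {
        return secret == null;
    }
}

// "Outside" code that reaches the private field through java.lang.reflect.
class PrivateFieldAccess {
    static String read(Vault vault) throws ReflectiveOperationException {
        Field field = Vault.class.getDeclaredField("secret");
        field.setAccessible(true); // suppress the language-level access check
        return (String) field.get(vault);
    }

    static void write(Vault vault, String value) throws ReflectiveOperationException {
        Field field = Vault.class.getDeclaredField("secret");
        field.setAccessible(true);
        field.set(vault, value);
    }
}
```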

In C++, an object or a primitive field returned by value through a getter method is a cloned copy of the original, so external code is incapable of modifying the shielded object, in strict adherence to the data privacy documented in the language. Only when the getter method returns a pointer or a non-const reference to the object, instead of the object itself, is the 'intrusion' possible. The limitation explained above for Java is rooted in its pointer-less design, in which every object is handled through a reference and cannot be passed or returned by value, which inhibits the program from flexible object allocation, administration and propagation.

Having said that, it is important to understand the notion of data privacy in Java as a means to design modular code for better maintainability, and as a mild endorsement by the compiler for fending off cluttered and unsolicited data accesses in the code. Encapsulation cannot be used as a means to achieve data security in the application. Reusable code modules that take part in enterprise business applications should not rely on this language feature to preserve the desired security; instead, the data has to be secured through custom means specific to the application.