Abstract

A Comparison of Three Approaches to Language, Compiler, and Library Support for Multidimensional Arrays in Java
Jose Moreira - IBM T. J. Watson Research Center
Sam Midkiff - IBM T. J. Watson Research Center
Manish Gupta - IBM T. J. Watson Research Center
The lack of direct support for multidimensional arrays in Java(TM)
has been recognized as a major deficiency in the language's 
applicability to numerical computing.  The typical approach to adding
multidimensional arrays to Java has been through class libraries that
implement these structures.  It has been shown that the class library
approach can achieve very high performance for numerical computing,
through the use of compiler techniques and efficient
implementations of aggregate array operations. Because of the
inconvenience of accessing array elements through method invocations,
many advocate that class libraries for multidimensional
arrays should be combined with new language syntax to facilitate
manipulation of those multidimensional arrays.  Another approach that
has been discussed in the literature is that of relying exclusively on
the JVM to recognize those arrays of arrays that are being used to
simulate multidimensional arrays.  This approach can also deliver good
performance, but it does not improve the existing interfaces for
numerical computing.  There is yet a third approach: extending
the Java language with new syntactic constructs for multidimensional
arrays and directly compiling those constructs to bytecode.  The new
constructs provide a more convenient interface for numerical
computing, without requiring a matching class library.  This paper is
a comparative discussion of the three approaches to adding
multidimensional arrays to Java mentioned above. We present a
description of the three approaches, listing the pros and cons of
each.  We give a more detailed description of the third approach --
language constructs translated to bytecode -- as it is a new
contribution. We compare the approaches with regard to
functionality, impact on the language and virtual machine
specifications, implementation effort, and typically achievable
performance. We show that the best choice
depends on the relative importance attached to the above metrics.
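
To make the interface question concrete, the following minimal sketch
contrasts element access on a Java array of arrays with element access
through a class library. The class DoubleArray2D and its get/set methods
are hypothetical names introduced only for this illustration; they are
not the API of any library discussed in the paper, and the row-major
storage layout is an assumption.

```java
/** Hypothetical dense 2-D array class (illustration only, not a real library API). */
final class DoubleArray2D {
    private final double[] data; // assumed layout: contiguous, row-major
    private final int rows;
    private final int cols;

    DoubleArray2D(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    int rows() { return rows; }
    int cols() { return cols; }

    double get(int i, int j) { return data[index(i, j)]; }

    void set(int i, int j, double value) { data[index(i, j)] = value; }

    // Maps (i, j) to an offset in the flat array, with explicit bounds checks.
    private int index(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return i * cols + j;
    }
}

public class ArrayStyles {
    // Array of arrays: concise bracket syntax, but each row is a separate
    // object and rows may differ in length.
    static void scaleArrayOfArrays(double[][] a, double factor) {
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                a[i][j] = factor * a[i][j];
            }
        }
    }

    // Class library: rectangular shape is guaranteed, but every element
    // access is written as a method invocation.
    static void scaleLibrary(DoubleArray2D a, double factor) {
        for (int i = 0; i < a.rows(); i++) {
            for (int j = 0; j < a.cols(); j++) {
                a.set(i, j, factor * a.get(i, j));
            }
        }
    }

    public static void main(String[] args) {
        double[][] jagged = {{1.0, 2.0}, {3.0, 4.0}};
        scaleArrayOfArrays(jagged, 2.0);

        DoubleArray2D dense = new DoubleArray2D(2, 2);
        dense.set(1, 1, 4.0);
        scaleLibrary(dense, 2.0);

        System.out.println(jagged[1][1] + " " + dense.get(1, 1));
    }
}
```

The two loop bodies perform the same computation. The difference in
notation between a[i][j] and a.set(i, j, ...) is the kind of
inconvenience that motivates new language syntax for multidimensional
arrays.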