The Java Programming Language
The Core Language
Implementations of the Java programming language provide strong memory safety, even in the presence of data races in concurrent code. This prevents a large range of security vulnerabilities from occurring, unless certain low-level features are used; see Low-level Features of the Virtual Machine.
Increasing Robustness when Reading Arrays
External data formats often include arrays, and the data is stored as an integer indicating the number of array elements, followed by this number of elements in the file or protocol data unit. The specified length can be much larger than what is actually available in the data source.
To avoid allocating extremely large amounts of data, you can allocate a small array initially and grow it as you read more data, implementing an exponential growth policy. See the readBytes(InputStream, int) function in Incrementally reading a byte array.
static byte[] readBytes(InputStream in, int length) throws IOException {
    final int startSize = 65536;
    byte[] b = new byte[Math.min(length, startSize)];
    int filled = 0;
    while (true) {
        // Fill the part of the array that has not been read yet.
        int remaining = b.length - filled;
        readFully(in, b, filled, remaining);
        if (b.length == length) {
            break;
        }
        filled = b.length;
        if (length - b.length <= b.length) {
            // Allocate final length. Condition avoids overflow.
            b = Arrays.copyOf(b, length);
        } else {
            b = Arrays.copyOf(b, b.length * 2);
        }
    }
    return b;
}

static void readFully(InputStream in, byte[] b, int off, int len)
        throws IOException {
    while (len > 0) {
        int count = in.read(b, off, len);
        if (count < 0) {
            // The stream ended before the requested length was read.
            throw new EOFException();
        }
        off += count;
        len -= count;
    }
}
When reading data into array lists, hash maps or hash sets, use the default constructor and do not specify a size hint. You can simply add the elements to the collection as you read them.
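The advice about collections can be sketched as follows. The wire format (a 32-bit big-endian count followed by that many 32-bit integers) and the names LengthPrefixedReader and readIntList are assumptions made for this illustration; they are not part of any particular protocol.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class LengthPrefixedReader {
    private LengthPrefixedReader() {}

    // Reads an element count followed by that many 32-bit integers.
    static List<Integer> readIntList(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int count = data.readInt();
        if (count < 0) {
            throw new IOException("negative element count: " + count);
        }
        // Default constructor: the untrusted count is not used as a size hint,
        // so memory use grows only with the data that is actually present.
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // readInt() throws EOFException if the input ends early.
            result.add(data.readInt());
        }
        return result;
    }
}
```

With this approach, an input that announces a huge count but contains only a few elements fails with an EOFException after those elements have been read, instead of triggering a large up-front allocation.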
Resource Management
Unlike C++, Java does not offer destructors which can deallocate resources in a predictable fashion. All resource management has to be manual, at the usage site. (Finalizers are generally not usable for resource management, especially in high-performance code; see Finalizers.)
The first option is the try-finally construct, as shown in Resource management with a try-finally block. The code in the finally block should be as short as possible and should not throw any exceptions.
Resource management with a try-finally block
InputStream in = new BufferedInputStream(new FileInputStream(path));
try {
    readFile(in);
} finally {
    in.close();
}
Note that the resource allocation happens outside the try block, and that there is no null check in the finally block. (Both are common artifacts stemming from IDE code templates.)
If the resource object is created freshly and implements the java.lang.AutoCloseable interface, the code in Resource management using the try-with-resource construct can be used instead. The Java compiler will automatically insert the close() method call in a synthetic finally block.
Resource management using the try-with-resource construct
try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
    readFile(in);
}
To be compatible with the try-with-resource construct, new classes should name the resource deallocation method close(), and implement the AutoCloseable interface (the latter breaking backwards compatibility with Java 6). However, using the try-with-resource construct with objects that are not freshly allocated is at best awkward, and an explicit finally block is usually the better approach.
In general, it is best to design the programming interface in such a way that resource deallocation methods like close() cannot throw any (checked or unchecked) exceptions, but this should not be a reason to ignore any actual error conditions.
Finalizers
Finalizers can be used as a last-resort approach to free resources which would otherwise leak. Finalization is unpredictable and costly, and there can be a considerable delay between the last reference to an object going away and the execution of the finalizer. Generally, manual resource management is required; see Resource Management.
Finalizers should be very short and should only deallocate native or other external resources held directly by the object being finalized. In general, they must use synchronization: Finalization necessarily happens on a separate thread because it is inherently concurrent. There can be multiple finalization threads, and despite each object being finalized at most once, the finalizer must not assume that it has exclusive access to the object being finalized (in the this pointer).
Finalizers should not deallocate resources held by other objects, especially if those objects have finalizers of their own. In particular, it is a very bad idea to define a finalizer just to invoke the resource deallocation method of another object, or to overwrite some pointer fields.
Finalizers are not guaranteed to run at all. For instance, the virtual machine (or the machine underneath) might crash, preventing their execution.
Objects with finalizers are garbage-collected much later than objects without them, so using finalizers to zero out key material (to reduce the time it spends unencrypted in memory) may have the opposite effect, keeping objects around for much longer and preventing them from being overwritten in the normal course of program execution.
For the same reason, code which allocates objects with finalizers at a high rate will eventually fail (likely with a java.lang.OutOfMemoryError exception) because the virtual machine has finite resources for keeping track of objects pending finalization. To deal with that, it may be necessary to recycle objects with finalizers.
The remarks in this section apply to finalizers which are implemented by overriding the finalize() method, and to custom finalization using reference queues.
Recovering from Exceptions and Errors
Java exceptions come in three kinds, all ultimately deriving from java.lang.Throwable:
- Run-time exceptions do not have to be declared explicitly and can be explicitly thrown from any code, by calling code which throws them, or by triggering an error condition at run time, like division by zero, or an attempt at an out-of-bounds array access. These exceptions derive from the java.lang.RuntimeException class (perhaps indirectly).
- Checked exceptions have to be declared explicitly by functions that throw or propagate them. They are similar to run-time exceptions in other regards, except that there is no language construct to throw them (except the throw statement itself). Checked exceptions are only present at the Java language level and are only enforced at compile time. At run time, the virtual machine does not know about them and permits throwing exceptions from any code. Checked exceptions must derive (perhaps indirectly) from the java.lang.Exception class, but not from java.lang.RuntimeException.
- Errors are exceptions which typically reflect serious error conditions. They can be thrown at any point in the program, and do not have to be declared (unlike checked exceptions). In general, it is not possible to recover from such errors; more on that below, in The Difficulty of Catching Errors. Error classes derive (perhaps indirectly) from java.lang.Error, or from java.lang.Throwable, but not from java.lang.Exception.
The general expectation is that run-time exceptions are avoided by careful programming (e.g., not dividing by zero). Checked exceptions are expected to be caught as they happen (e.g., when an input file is unexpectedly missing). Errors are impossible to predict, can happen at any point, and reflect that something went wrong beyond all expectations.
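As a small illustration of the three kinds described above, the following sketch classifies a Throwable by its position in the class hierarchy. The class and method names are hypothetical, and the classification follows the definitions used in this section (anything that does not derive from java.lang.Exception is treated as an error).

```java
// Hypothetical helper, for illustration only.
final class ThrowableKinds {
    private ThrowableKinds() {}

    static String classify(Throwable t) {
        if (t instanceof RuntimeException) {
            // Does not have to be declared, e.g. ArithmeticException.
            return "run-time exception";
        }
        if (t instanceof Exception) {
            // Derives from Exception but not RuntimeException, e.g. IOException.
            return "checked exception";
        }
        // Derives from Error, or from Throwable but not from Exception.
        return "error";
    }
}
```

The order of the tests matters: RuntimeException is itself a subclass of Exception, so it has to be checked first.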
The Difficulty of Catching Errors
Errors (that is, exceptions which do not (indirectly) derive from java.lang.Exception) have the peculiar property that catching them is problematic. There are several reasons for this:
- The error reflects a failed consistency check, for example, java.lang.AssertionError.
- The error can happen at any point, resulting in inconsistencies due to half-updated objects. Examples are java.lang.ThreadDeath, java.lang.OutOfMemoryError and java.lang.StackOverflowError.
- The error indicates that the virtual machine failed to provide some semantic guarantees made by the Java programming language. java.lang.ExceptionInInitializerError is an example; it can leave behind a half-initialized class.
In general, if an error is thrown, the virtual machine should be restarted as soon as possible because it is in an inconsistent state. Continuing to run as before can have unexpected consequences. However, there are legitimate reasons for catching errors because not doing so leads to even greater problems.
Code should be written in a way that avoids triggering errors. See Increasing Robustness when Reading Arrays for an example.
It is usually necessary to log errors. Otherwise, no trace of the problem might be left anywhere, making it very difficult to diagnose related failures. Consequently, if you catch java.lang.Exception to log and suppress all unexpected exceptions (for example, in a request dispatching loop), you should consider switching to java.lang.Throwable instead, to also cover errors.
The other reason mainly applies to such request dispatching loops: If you do not catch errors, the loop stops looping, resulting in a denial of service.
However, if possible, catching errors should be coupled with a way to signal the requirement of a virtual machine restart.
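A request dispatching loop along these lines might look like the following sketch. The RequestDispatcher class, its use of Runnable as the request type, and the restart flag are assumptions made for this illustration; how a restart is actually triggered depends on the deployment environment.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical dispatcher, for illustration only.
final class RequestDispatcher {
    private static final Logger LOG =
        Logger.getLogger(RequestDispatcher.class.getName());

    // Set once an error has been seen; a supervisor could poll this flag
    // and restart the virtual machine.
    private final AtomicBoolean restartRequired = new AtomicBoolean();

    void dispatchAll(List<Runnable> requests) {
        for (Runnable request : requests) {
            try {
                request.run();
            } catch (Exception e) {
                // Unexpected exception: log it and continue with the next request.
                LOG.log(Level.WARNING, "request failed", e);
            } catch (Throwable t) {
                // Error: log it, keep the loop alive, and signal that the
                // virtual machine should be restarted.
                LOG.log(Level.SEVERE, "error while handling request", t);
                restartRequired.set(true);
            }
        }
    }

    boolean isRestartRequired() {
        return restartRequired.get();
    }
}
```

The loop keeps serving requests after an error, which avoids the denial of service, while the flag records that the state of the virtual machine can no longer be fully trusted.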
Low-level Features of the Virtual Machine
Reflection and Private Parts
The setAccessible(boolean) method of the java.lang.reflect.AccessibleObject class allows a program to disable language-defined access rules for specific constructors, methods, or fields. Once the access checks are disabled, any code can use the java.lang.reflect.Constructor, java.lang.reflect.Method, or java.lang.reflect.Field object to access the underlying Java entity, without further permission checks. This breaks encapsulation and can undermine the stability of the virtual machine. (In contrast, without using the setAccessible(boolean) method, this should not happen because all the language-defined checks still apply.)
This feature should be avoided if possible.
Java Native Interface (JNI)
The Java Native Interface allows Java code to call functions specifically written for this purpose, usually in C or C++.
The transition between the Java world and the C world is not fully type-checked, and the C code can easily break the Java virtual machine semantics. Therefore, extra care is needed when using this functionality.
To provide a moderate amount of type safety, it is recommended to recreate the class-specific header file using javah during the build process (or javac -h on JDK 10 and later, where javah is no longer available), include it in the implementation, and use the -Wmissing-declarations option.
Ideally, the required data is directly passed to static JNI methods and returned from them, and the code on the C side does not have to deal with accessing Java fields (or even methods).
When using GetPrimitiveArrayCritical or GetStringCritical, make sure that you only perform very little processing between the get and release operations. Do not access the file system or the network, and do not perform locking, because that might introduce blocking. When processing large strings or arrays, consider splitting the computation into multiple sub-chunks, so that you do not prevent the JVM from reaching a safepoint for extended periods of time.
If necessary, you can use the Java long type to store a C pointer in a field of a Java class. On the C side, take care when casting between the jlong value and the pointer. You should not try to perform pointer arithmetic on the Java side (that is, you should treat pointer-carrying long values as opaque). When passing a slice of an array to the native code, follow the Java convention and pass it as the base array, the integer offset of the start of the slice, and the integer length of the slice. On the native side, check the offset/length combination against the actual array length, and use the offset to compute the pointer to the beginning of the slice.
JNIEXPORT jint JNICALL Java_sum
  (JNIEnv *jEnv, jclass clazz, jbyteArray buffer, jint offset, jint length)
{
  assert(sizeof(jint) == sizeof(unsigned));
  if (offset < 0 || length < 0) {
    (*jEnv)->ThrowNew(jEnv, arrayIndexOutOfBoundsExceptionClass,
                      "negative offset/length");
    return 0;
  }
  unsigned uoffset = offset;
  unsigned ulength = length;
  // This cannot overflow because of the check above.
  unsigned totallength = uoffset + ulength;
  unsigned actuallength = (*jEnv)->GetArrayLength(jEnv, buffer);
  if (totallength > actuallength) {
    (*jEnv)->ThrowNew(jEnv, arrayIndexOutOfBoundsExceptionClass,
                      "offset + length too large");
    return 0;
  }
  unsigned char *ptr = (*jEnv)->GetPrimitiveArrayCritical(jEnv, buffer, 0);
  if (ptr == NULL) {
    return 0;
  }
  unsigned long long sum = 0;
  for (unsigned char *p = ptr + uoffset, *end = p + ulength; p != end; ++p) {
    sum += *p;
  }
  (*jEnv)->ReleasePrimitiveArrayCritical(jEnv, buffer, ptr, 0);
  return sum;
}
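The same offset/length check can also be written in Java, for example to validate arguments before a native method is called. The following is only a sketch with hypothetical names; it complements the check on the native side and does not replace it.

```java
// Hypothetical helper, for illustration only.
final class SliceChecks {
    private SliceChecks() {}

    // Verifies that [offset, offset + length) lies within an array of
    // arrayLength elements. arrayLength is assumed to be non-negative,
    // as is always the case for the length of a Java array.
    static void checkSlice(int arrayLength, int offset, int length) {
        if (offset < 0 || length < 0) {
            throw new ArrayIndexOutOfBoundsException("negative offset/length");
        }
        // Written as a subtraction so that offset + length is never computed
        // and therefore cannot overflow.
        if (offset > arrayLength - length) {
            throw new ArrayIndexOutOfBoundsException("offset + length too large");
        }
    }
}
```

A caller would invoke checkSlice(buffer.length, offset, length) before handing the three values to the native method.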
In any case, classes referring to native resources must be declared final, and must not be serializable or cloneable. Initialization and mutation of the state used by the native side must be controlled carefully. Otherwise, it might be possible to create an object with inconsistent native state which results in a crash (or worse) when used (or perhaps only finalized) later. If you need both Java inheritance and native resources, you should consider moving the native state to a separate class, and only keep a reference to objects of that class. This way, cloning and serialization issues can be avoided in most cases.
If there are native resources associated with an object, the class should have an explicit resource deallocation method (Resource Management) and a finalizer (Finalizers) as a last resort. The need for finalization means that a minimum amount of synchronization is needed. Code on the native side should check that the object is not in a closed/freed state.
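The Java side of such a class could be structured roughly as in the sketch below. The class name and the use of a plain long as a stand-in for a native pointer are assumptions for this illustration; a real class would pass the handle to native methods and release it in native code, and might add a finalizer as a last resort.

```java
// Hypothetical wrapper around a native resource, for illustration only.
// The class is final and neither serializable nor cloneable.
final class NativeHandle implements AutoCloseable {
    // Opaque pointer-carrying value; 0 means that the resource was freed.
    private long handle;

    NativeHandle(long handle) {
        if (handle == 0) {
            throw new IllegalArgumentException("null handle");
        }
        this.handle = handle;
    }

    // Synchronized because close() (or a finalizer) may run concurrently.
    synchronized long use() {
        if (handle == 0) {
            throw new IllegalStateException("already closed");
        }
        // A real class would pass the handle to a native method here.
        return handle;
    }

    // Explicit deallocation method; calling it more than once is harmless.
    @Override
    public synchronized void close() {
        if (handle != 0) {
            // A real class would call its native release function here.
            handle = 0;
        }
    }
}
```

Checking the closed state on the Java side does not remove the need for the native code to validate the handle it receives.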
Many JNI functions create local references. By default, these persist until the JNI-implemented method returns. If you create many such references (e.g., in a loop), you may have to free them using DeleteLocalRef, or start using PushLocalFrame and PopLocalFrame. Global references must be deallocated with DeleteGlobalRef, otherwise there will be a memory leak, just as with malloc and free.
When throwing exceptions using Throw or ThrowNew, be aware that these functions return normally: the exception is only recorded as pending. You have to return control manually to the JVM.
Technically, the JNIEnv pointer is not necessarily constant during the lifetime of your JNI module. Storing it in a global variable is therefore incorrect. Particularly if you are dealing with callbacks, you may have to store the pointer in a thread-local variable (defined with __thread). It is, however, best to avoid the complexity of calling back into Java code.
Keep in mind that C/C++ and Java are different languages, despite very similar syntax for expressions. The Java memory model is much more strict than the C or C++ memory models, and native code needs more synchronization, usually using JVM facilities or POSIX threads mutexes. Integer overflow in Java is defined, but in C/C++ it is not (for the jint and jlong types).
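The Java side of this difference can be demonstrated with a short sketch (the class and method names are hypothetical). Plain int addition wraps around silently, while Math.addExact reports the overflow with an ArithmeticException.

```java
// Hypothetical demonstration class, for illustration only.
final class OverflowDemo {
    private OverflowDemo() {}

    // Java defines int overflow as wrapping in two's complement.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Math.addExact throws ArithmeticException instead of wrapping.
    static int checkedAdd(int a, int b) {
        return Math.addExact(a, b);
    }
}
```

For example, wrappingAdd(Integer.MAX_VALUE, 1) yields Integer.MIN_VALUE. The equivalent signed addition on jint values in C or C++ has undefined behavior, so native code cannot rely on the wrapped result.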
Interacting with the Security Manager
The Java platform is largely implemented in the Java language itself. Therefore, within the same JVM, code runs which is part of the Java installation and which is trusted, but there might also be code which comes from untrusted sources and is restricted by the Java sandbox (to varying degrees). The security manager draws a line between fully trusted, partially trusted and untrusted code.
The type safety and accessibility checks provided by the Java language and JVM would be sufficient to implement a sandbox. However, only some Java APIs employ such a capabilities-based approach. (The Java SE library contains many public classes with public constructors which can break any security policy, such as java.io.FileOutputStream.) Instead, critical functionality is protected by stack inspection: At a security check, the stack is walked from top (most-nested) to bottom. The security check fails if a stack frame for a method is encountered whose class lacks the permission which the security check requires.
This simple approach would not allow untrusted code (which lacks certain permissions) to call into trusted code while the latter retains trust. Such trust transitions are desirable because they enable Java as an implementation language for most parts of the Java platform, including security-relevant code. Therefore, there is a mechanism to mark certain stack frames as trusted (Re-gaining Privileges).
In theory, it is possible to run a Java virtual machine with a security manager that acts very differently from this approach, but a lot of code expects behavior very close to the platform default (including many classes which are part of the OpenJDK implementation).
Security Manager Compatibility
A lot of code can run without any additional permissions at all, with few changes. The following guidelines should help to increase compatibility with a restrictive security manager.
- When retrieving system properties using System.getProperty(String) or similar methods, catch SecurityException exceptions and treat the property as unset.
- Avoid unnecessary file system or network access.
- Avoid explicit class loading. Access to a suitable class loader might not be available when executing as untrusted code.
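The first guideline can be sketched as a small helper; the class and method names are hypothetical.

```java
// Hypothetical helper, for illustration only.
final class SafeProperties {
    private SafeProperties() {}

    // Returns the value of the system property, or null if it is unset
    // or if a security manager denies access to it.
    static String getPropertyOrNull(String name) {
        try {
            return System.getProperty(name);
        } catch (SecurityException e) {
            // Treat an inaccessible property as unset.
            return null;
        }
    }
}
```

Callers then handle a null result in the same way regardless of whether the property is missing or merely inaccessible.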
If the functionality you are implementing absolutely requires privileged access and this functionality has to be used from untrusted code (hopefully in a restricted and secure manner), see Re-gaining Privileges.
Activating the Security Manager
The usual command to launch a Java application, java, does not activate the security manager. Therefore, the virtual machine does not enforce any sandboxing restrictions, even if explicitly requested by the code (for example, as described in Reducing Trust in Code).
The -Djava.security.manager option activates the security manager, with the fairly restrictive default policy. With a very permissive policy, most Java code will run unchanged. Assuming the policy in Most permissive OpenJDK policy file has been saved in a file grant-all.policy, this policy can be activated using the option -Djava.security.policy=grant-all.policy (in addition to the -Djava.security.manager option).
grant {
    permission java.security.AllPermission;
};
With this most permissive policy, the security manager is still active, and explicit requests to drop privileges will be honored.
Reducing Trust in Code
The Using the security manager to run code with reduced privileges example shows how to run a piece of code with reduced privileges.
Permissions permissions = new Permissions();
ProtectionDomain protectionDomain =
    new ProtectionDomain(null, permissions);
AccessControlContext context = new AccessControlContext(
    new ProtectionDomain[] { protectionDomain });

// This is expected to succeed.
try (FileInputStream in = new FileInputStream(path)) {
    System.out.format("FileInputStream: %s%n", in);
}

AccessController.doPrivileged(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
        // This code runs with reduced privileges and is
        // expected to fail.
        try (FileInputStream in = new FileInputStream(path)) {
            System.out.format("FileInputStream: %s%n", in);
        }
        return null;
    }
}, context);
The example above does not add any additional permissions to the permissions object. If such permissions are necessary, code like the following (which grants read permission on all files in the current directory) can be used:
permissions.add(new FilePermission(
    System.getProperty("user.dir") + "/-", "read"));
Calls to the
The example code above does not prevent the called code from
calling the
For activating the security manager, see Activating the Security Manager. Unfortunately, this affects the virtual machine as a whole, so it is not possible to do this from a library.
Re-gaining Privileges
Ordinarily, when trusted code is called from untrusted code, it loses its privileges (because of the untrusted stack frames visible to stack inspection). The java.security.AccessController.doPrivileged() family of methods provides a controlled backdoor from untrusted to trusted code.
By design, this feature can undermine the Java security model and the sandbox. It has to be used very carefully. Most sandbox vulnerabilities can be traced back to its misuse.
In essence, the doPrivileged() methods cause the stack inspection to end at their call site. Untrusted code further down the call stack becomes invisible to security checks.
The following operations are common and safe to perform with elevated privileges.
- Reading custom system properties with fixed names, especially if the value is not propagated to untrusted code. (File system paths including installation paths, host names and user names are sometimes considered private information and need to be protected.)
- Reading from the file system at fixed paths, either determined at compile time or by a system property. Again, leaking the file contents to the caller can be problematic.
- Accessing network resources under a fixed address, name or URL, derived from a system property or configuration file, information leaks notwithstanding.
The Using the security manager to run code with increased privileges example shows how to request additional privileges.
// This is expected to fail.
try {
    System.out.println(System.getProperty("user.home"));
} catch (SecurityException e) {
    e.printStackTrace(System.err);
}
AccessController.doPrivileged(new PrivilegedAction<Void>() {
    public Void run() {
        // This should work.
        System.out.println(System.getProperty("user.home"));
        return null;
    }
});
Obviously, this only works if the class containing the call to doPrivileged() is marked trusted (usually because it is loaded from a trusted class loader).
When writing code that runs with elevated privileges, make sure that you follow the rules below.
- Make the privileged code as small as possible. Perform as many computations as possible before and after the privileged code section, even if it means that you have to define a new class to pass the data around.
- Make sure that you either control the inputs to the privileged code, or that the inputs are harmless and cannot affect security properties of the privileged code.
- Data that is returned from or written by the privileged code must either be restricted (that is, it cannot be accessed by untrusted code), or must be harmless. Otherwise, privacy leaks or information disclosures which affect security properties can be the result.
If the code calls back into untrusted code at a later stage (or performs other actions under control from the untrusted caller), you must obtain the original security context and restore it before performing the callback, as in Restoring privileges when invoking callbacks. (In this example, it would be much better to move the callback invocation out of the privileged code section, of course.)
interface Callback<T> {
    T call(boolean flag);
}

class CallbackInvoker<T> {
    private final AccessControlContext context;
    Callback<T> callback;

    CallbackInvoker(Callback<T> callback) {
        context = AccessController.getContext();
        this.callback = callback;
    }

    public T invoke() {
        // Obtain increased privileges.
        return AccessController.doPrivileged(new PrivilegedAction<T>() {
            @Override
            public T run() {
                // This operation would fail without
                // additional privileges.
                final boolean flag = Boolean.getBoolean("some.property");

                // Restore the original privileges.
                return AccessController.doPrivileged(
                    new PrivilegedAction<T>() {
                        @Override
                        public T run() {
                            return callback.call(flag);
                        }
                    }, context);
            }
        });
    }
}