Wednesday, January 22, 2014

Java Concurrency in Practice



Threads share process-wide resources such as memory and file handles, but each thread has its own program counter, stack, and local variables.
Multiple threads within the same program can be scheduled simultaneously on multiple CPUs.
But without explicit synchronization to coordinate access to shared data, a thread may modify variables that another thread
is in the middle of using, with unpredictable results.
Threads are useful in GUI applications for improving the responsiveness of the user interface, and in server applications
for improving resource utilization and throughput.
GUI applications are inherently asynchronous. Swing and AWT address this problem by creating a separate thread for handling user-initiated
events and updating the graphical view presented to the user. Swing programs achieve their thread safety by confining all access to
GUI components to the event thread. If an application wants to manipulate the GUI from outside the event thread, it must cause the code
that will manipulate the GUI to run in the event thread instead.

If multiple threads access the same mutable state variable without appropriate synchronization, your program is broken. There are three ways to fix it:
Don't share the state variable across threads;
Make the state variable immutable; or
Use synchronization whenever accessing the state variable.
It is far easier to design a class to be thread-safe than to retrofit it for thread safety later.
Because concurrency bugs are so difficult to reproduce and debug, the benefit of a small performance gain on some infrequently used code path may
well be dwarfed by the risk that the program will fail in the field.
A class is thread-safe when it continues to behave correctly when accessed from multiple threads.
Stateless objects are always thread-safe.
A race condition occurs when getting the right answer relies on lucky timing. (count++, meet friends at Starbucks, Lazy Initialization)
To ensure thread safety, check-then-act operations (like lazy initialization) and read-modify-write operations (like increment) must always be atomic.
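As a minimal sketch of the check-then-act race in lazy initialization (class names are placeholders, not a book listing): two threads can both observe the field as null and each construct and return a different object.

public class LazyInitRace {
    // Placeholder for an object that is costly to construct.
    static class ExpensiveObject { }

    private ExpensiveObject instance = null;

    public ExpensiveObject getInstance() {
        // Check-then-act race: two threads may both see instance == null
        // and each construct and return a different ExpensiveObject.
        if (instance == null)
            instance = new ExpensiveObject();
        return instance;
    }
}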

@ThreadSafe
public class CountingFactorizer implements Servlet {
    private final AtomicLong count = new AtomicLong(0);

    public long getCount() { return count.get(); }

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        count.incrementAndGet();   // AtomicLong makes the read-modify-write atomic
        encodeIntoResponse(resp, factors);
    }
    // extractFromRequest, factor, and encodeIntoResponse are helper methods
    // elided from the original example.
}

Where practical, use existing thread-safe objects, like AtomicLong, to manage your class's state. This makes it easier to maintain and verify thread safety.

A synchronized block has two parts: a reference to an object that will serve as the lock, and a block of code to be guarded by that lock.
A synchronized method is a shorthand for a synchronized block that spans an entire method body, and whose lock is the object on which
the method is being invoked. (Static synchronized methods use the Class object for the lock.)
Intrinsic locks are reentrant: if a thread tries to acquire a lock that it already holds, the request succeeds.
Reentrancy is implemented by associating with each lock an acquisition count and an owning thread.
If synchronization is used to coordinate access to a variable, it is needed everywhere that variable is accessed. Further,
when using locks to coordinate access to a variable, the same lock must be used wherever that variable is accessed.
Code auditing tools like FindBugs can identify when a variable is frequently but not always accessed with a lock held, which may indicate a bug.
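A minimal sketch of that rule (names are mine, not the book's): every access to the counter, read or write, holds the same intrinsic lock.

public class SynchronizedCounter {
    private long count = 0;   // guarded by the intrinsic lock on "this"

    // Both the write and the read hold the same lock, so callers always
    // see a consistent, up-to-date value.
    public synchronized void increment() {
        count++;
    }

    public synchronized long get() {
        return count;
    }
}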

Avoid holding locks during lengthy computations or operations at risk of not completing quickly such as network or console I/O.

Writing correct concurrent programs is primarily about managing access to shared, mutable state.
Always use the proper synchronization whenever data is shared across threads.
Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up-to-date
values of shared mutable variables, the reading and writing threads must synchronize on a common lock.
A read of a volatile variable always returns the most recent write by any thread.
Locking can guarantee both visibility and atomicity; volatile variables can only guarantee visibility.
(The semantics of volatile are not strong enough to make the increment operation count++ atomic.)
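
A typical legitimate use of volatile is a shutdown or status flag written by one thread and read by another, where only visibility is required; a minimal sketch (class and field names are illustrative, not from the book):

public class Worker implements Runnable {
    private volatile boolean shutdownRequested = false;

    public void requestShutdown() {
        shutdownRequested = true;   // visible to the worker thread without locking
    }

    public void run() {
        while (!shutdownRequested) {
            // do a unit of work; the loop will observe the flag change
        }
    }
}
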
When an object creates a thread from its constructor, it almost always shares its this reference with the new thread, either explicitly
(by passing it to the constructor) or implicitly (because the Thread or Runnable is an inner class of the owning object). The new
thread might then be able to see the owning object before it is fully constructed. There's nothing wrong with creating a thread
in a constructor, but it is best not to start the thread immediately.
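One common way to follow that advice is a private constructor plus a factory method that starts the thread only after construction completes; a sketch (names are illustrative, not a book listing):

public class OwnedWorker {
    private final Thread worker;

    private OwnedWorker() {
        // The thread is created here but not started, so the new thread
        // cannot observe "this" before construction has finished.
        worker = new Thread(new Runnable() {
            public void run() {
                doWork();
            }
        });
    }

    public static OwnedWorker newInstance() {
        OwnedWorker owner = new OwnedWorker();
        owner.worker.start();   // started only after the constructor has returned
        return owner;
    }

    void doWork() { /* ... */ }
}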

If data is only accessed from a single thread, no synchronization is needed. This technique, thread confinement,
is one of the simplest ways to achieve thread safety. Swing uses thread confinement extensively. The Swing visual components and data model
objects are not thread safe; instead, safety is achieved by confining them to the Swing event dispatch thread.
To use Swing properly, code running in threads other than the event thread should not access these objects.
(To make this easier, Swing provides the invokeLater mechanism to schedule a Runnable for execution in the event thread.)

ThreadLocal provides get and set accessor methods that maintain a separate copy of the value for each thread that uses it,
so a get returns the most recent value passed to set from the currently executing thread.

// DB_URL is assumed to be defined elsewhere in the class.
private static ThreadLocal<Connection> connectionHolder
    = new ThreadLocal<Connection>() {
        @Override
        public Connection initialValue() {
            try {
                return DriverManager.getConnection(DB_URL);
            } catch (SQLException e) {
                // getConnection throws a checked SQLException; wrap it so it
                // can propagate out of initialValue.
                throw new RuntimeException(e);
            }
        }
    };

public static Connection getConnection() {
    return connectionHolder.get();
}

Immutable objects are always thread-safe.
Immutable objects are simple. They can only be in one state, which is carefully controlled by the constructor.
An object whose fields are all final may still be mutable, since final fields can hold references to mutable objects.
Final fields can't be modified (although the objects they refer to can be modified if they are mutable)
Just as it is a good practice to make all fields private unless they need greater visibility [EJ Item 12], it is a good practice to make
all fields final unless they need to be mutable.
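A minimal sketch of an immutable class along these lines (illustrative, not a book listing): all fields are final and set once in the constructor, and "modification" produces a new object.

public final class ImmutablePoint {
    private final int x;
    private final int y;

    public ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }
    public int getY() { return y; }

    // Returns a new object instead of changing this one.
    public ImmutablePoint translate(int dx, int dy) {
        return new ImmutablePoint(x + dx, y + dy);
    }
}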

The synchronized collections (created by the Collections.synchronizedXxx wrapper factory methods) are thread-safe. With a synchronized collection, compound
actions such as the get-last and delete-last operations below are still technically thread-safe even without client-side locking, but they may not behave as you might expect when other
threads can concurrently modify the collection.

public static Object getLast(Vector list) {
    int lastIndex = list.size() - 1;
    return list.get(lastIndex);
}

public static void deleteLast(Vector list) {
    int lastIndex = list.size() - 1;
    list.remove(lastIndex);
}

To make these compound actions atomic, guard them with client-side locking on the list itself:

public static Object getLast(Vector list) {
    synchronized (list) {
        int lastIndex = list.size() - 1;
        return list.get(lastIndex);
    }
}

public static void deleteLast(Vector list) {
    synchronized (list) {
        int lastIndex = list.size() - 1;
        list.remove(lastIndex);
    }
}

List<Widget> widgetList
    = Collections.synchronizedList(new ArrayList<Widget>());
...
// May throw ConcurrentModificationException
for (Widget w : widgetList)
    doSomething(w);

An alternative to locking the collection during iteration is to clone the collection and iterate the copy instead.
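A sketch of both approaches for the widgetList above (doSomething is assumed to be defined elsewhere): hold the list's lock for the whole traversal, or snapshot it and iterate the copy.

// Option 1: lock the list for the duration of the iteration.
synchronized (widgetList) {
    for (Widget w : widgetList)
        doSomething(w);
}

// Option 2: copy the list while holding the lock, then iterate the copy
// without locking (doSomething sees a snapshot, not the live collection).
List<Widget> snapshot;
synchronized (widgetList) {
    snapshot = new ArrayList<Widget>(widgetList);
}
for (Widget w : snapshot)
    doSomething(w);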
Replacing synchronized collections with concurrent collections can offer dramatic scalability improvements with little risk.
Java 5.0 adds ConcurrentHashMap, a replacement for synchronized hash-based Map implementations, and CopyOnWriteArrayList,
a replacement for synchronized List implementations for cases where traversal is the dominant operation.

BlockingQueue extends Queue to add blocking insertion and retrieval operations. If the queue is empty, a retrieval blocks until
an element is available, and if the queue is full (for bounded queues) an insertion blocks until there is space available.
Blocking queues are extremely useful in producer-consumer designs.
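A minimal producer-consumer sketch using a bounded LinkedBlockingQueue (the item type and counts are illustrative; a real program would also arrange an orderly shutdown of the consumer):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ProducerConsumerDemo {
    public static void main(String[] args) {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>(10);

        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < 100; i++)
                        queue.put("item-" + i);   // blocks if the queue is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true)
                        process(queue.take());    // blocks if the queue is empty
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        producer.start();
        consumer.start();
    }

    static void process(String item) {
        System.out.println(item);
    }
}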

Just as ConcurrentHashMap is a concurrent replacement for a synchronized hash-based Map, Java 6 adds ConcurrentSkipListMap
and ConcurrentSkipListSet, which are concurrent replacements for a synchronized SortedMap or SortedSet (such as TreeMap
or TreeSet wrapped with synchronizedMap or synchronizedSet).

ConcurrentHashMap, along with the other concurrent collections, further improves on the synchronized collection classes by providing
iterators that do not throw ConcurrentModificationException, thus eliminating the need to lock the collection during iteration.
The iterators returned by ConcurrentHashMap are weakly consistent instead of fail-fast. A weakly consistent iterator can
tolerate concurrent modification, traverses elements as they existed when the iterator was constructed, and may (but is not
guaranteed to) reflect modifications to the collection after the construction of the iterator.

Since the result of size could be out of date by the time it is computed, it is really only an estimate, so size
is allowed to return an approximation instead of an exact count.

The one feature offered by the synchronized Map implementations but not by ConcurrentHashMap is the ability to lock the map
 for exclusive access. With Hashtable and synchronizedMap, acquiring the Map lock prevents any other thread from accessing it.
Because it has so many advantages and so few disadvantages compared to Hashtable or synchronizedMap, replacing synchronized Map
implementations with ConcurrentHashMap in most cases results only in better scalability. Only if your application needs to
lock the map for exclusive access [3] is ConcurrentHashMap not an appropriate drop-in replacement.
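As a sketch, the replacement is usually a one-line change, and the ConcurrentMap interface adds atomic compound operations such as putIfAbsent and replace, so no client-side locking is needed (the keys and values here are illustrative):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<String, Integer>();

// Atomic check-then-act: inserts only if no mapping is present.
hits.putIfAbsent("page", 0);

// Atomic compare-and-set style update; a real increment would retry
// in a loop until replace succeeds.
Integer old = hits.get("page");
hits.replace("page", old, old + 1);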

CopyOnWriteArrayList is a concurrent replacement for a synchronized List that offers better concurrency in some common situations
and eliminates the need to lock or copy the collection during iteration. (Similarly, CopyOnWriteArraySet is a concurrent
replacement for a synchronized Set.) They implement mutability by creating and republishing a new copy of the collection every time it is modified.

Obviously, there is some cost to copying the backing array every time the collection is modified, especially if the collection is large; the copy-on-write
collections are reasonable to use only when iteration is far more common than modification. This criterion exactly describes many
event-notification systems: delivering a notification requires iterating the list of registered listeners and calling each
one of them, and in most cases registering or unregistering an event listener is far less common than receiving an event notification.
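A sketch of a listener registry along those lines (the listener type and class names are illustrative):

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class EventSource {
    // Registrations copy the backing array; iteration needs no lock
    // and never throws ConcurrentModificationException.
    private final List<Runnable> listeners = new CopyOnWriteArrayList<Runnable>();

    public void register(Runnable listener)   { listeners.add(listener); }
    public void unregister(Runnable listener) { listeners.remove(listener); }

    public void fire() {
        for (Runnable listener : listeners)
            listener.run();
    }
}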

Java 6 also adds another two collection types, Deque (pronounced "deck") and BlockingDeque, that extend Queue and BlockingQueue.
A Deque is a double-ended queue that allows efficient insertion and removal from both the head and the tail. Implementations
include ArrayDeque and LinkedBlockingDeque.

A producer-consumer design has one shared work queue for all consumers; in a work stealing design, every consumer has its own deque.
If a consumer exhausts the work in its own deque, it can steal work from the tail of someone else's deque. Work stealing can
be more scalable than a traditional producer-consumer design because workers don't contend for a shared work queue; most of
the time they access only their own deque, reducing contention. When a worker has to access another's queue, it does so from
the tail rather than the head, further reducing contention.

A blocked thread must wait for an event that is beyond its control before it can proceed.
The put and take methods of BlockingQueue throw the checked InterruptedException, as do a number of other library methods such as
Thread.sleep. When a method can throw InterruptedException, it is telling you that it is a blocking method, and further that
if it is interrupted, it will make an effort to stop blocking early.

Interruption is a cooperative mechanism. One thread cannot force another to stop what it is doing and do something else;
when thread A interrupts thread B, A is merely requesting that B stop what it is doing when it gets to a convenient
stopping point, if it feels like it. While there is nothing in the API or language specification that demands any specific
application-level semantics for interruption, the most sensible use for interruption is to cancel an activity. Blocking
methods that are responsive to interruption make it easier to cancel long-running activities on a timely basis.

When your code calls a method that throws InterruptedException, then your method is a blocking method too, and
must have a plan for responding to interruption.
For library code, there are basically two choices:
Propagate the InterruptedException. This is often the most sensible policy if you can get away with it: just propagate the InterruptedException
to your caller. This could involve not catching InterruptedException, or catching it and throwing it again after performing some brief
activity-specific cleanup.
Restore the interrupt. Sometimes you cannot throw InterruptedException, for instance when your code is part of a Runnable. In these
situations, you must catch InterruptedException and restore the interrupted status by calling interrupt on the current thread,
so that code higher up the call stack can see that an interrupt was issued:
public class TaskRunnable implements Runnable {
    BlockingQueue<Task> queue;
    ...
    public void run() {
        try {
            processTask(queue.take());
        } catch (InterruptedException e) {
             // restore interrupted status
             Thread.currentThread().interrupt();
        }
    }
}


A synchronizer is any object that coordinates the control flow of threads based on its state. Blocking queues can act as synchronizers;
other types of synchronizers include semaphores, barriers, and latches.

A latch acts as a gate: until the latch reaches the terminal state the gate is closed and no thread can pass, and in the terminal state
the gate opens, allowing all threads to pass. Once the latch reaches the terminal state, it cannot change state again, so it remains
open forever. Latches can be used to ensure that certain activities do not proceed until other one-time activities complete.
CountDownLatch is a flexible latch implementation that can be used in any of these situations; it allows one or more threads to
wait for a set of events to occur. The latch state consists of a counter initialized to a positive number, representing
the number of events to wait for. The countDown method decrements the counter, indicating that an event has occurred, and
the await methods wait for the counter to reach zero, which happens when all the events have occurred. If the counter is nonzero
on entry, await blocks until the counter reaches zero, the waiting thread is interrupted, or the wait times out.
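A minimal sketch of using a CountDownLatch to wait for a fixed number of worker threads to finish (the worker count and task body are illustrative):

import java.util.concurrent.CountDownLatch;

public class LatchDemo {
    public static void main(String[] args) throws InterruptedException {
        final int workers = 3;
        final CountDownLatch done = new CountDownLatch(workers);

        for (int i = 0; i < workers; i++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        // do some work ...
                    } finally {
                        done.countDown();   // signal that this worker finished
                    }
                }
            }).start();
        }

        done.await();   // blocks until all workers have counted down
        System.out.println("all workers finished");
    }
}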

FutureTask also acts like a latch. (FutureTask implements Future, which describes an abstract result-bearing computation [CPJ 4.3.3].)
A computation represented by a FutureTask is implemented with a Callable, the result-bearing equivalent of Runnable, and can be in
one of three states: waiting to run, running, or completed. Completion subsumes all the ways a computation can complete, including
normal completion, cancellation, and exception. Once a FutureTask enters the completed state, it stays in that state forever.

FutureTask is used by the Executor framework to represent asynchronous tasks, and can also be used to represent any potentially
lengthy computation that can be started before the results are needed.

Semaphores are useful for implementing resource pools such as database connection pools. While it is easy to construct a fixed-sized pool
that fails if you request a resource from an empty pool, what you really want is to block if the pool is empty and unblock when it
becomes nonempty again. If you initialize a Semaphore to the pool size, acquire a permit before trying to fetch a resource from the
pool, and release the permit after putting a resource back in the pool, acquire blocks until the pool becomes nonempty.
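A sketch of that idea (class and method names are mine, not the book's): a Semaphore initialized to the pool size makes acquire block while the pool is empty and a permit is only granted when an item is actually available.

import java.util.Collection;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

public class BoundedPool<T> {
    private final Queue<T> items = new ConcurrentLinkedQueue<T>();
    private final Semaphore available;

    public BoundedPool(Collection<T> initialItems) {
        items.addAll(initialItems);
        available = new Semaphore(initialItems.size());
    }

    public T acquire() throws InterruptedException {
        available.acquire();     // blocks while the pool is empty
        return items.poll();     // a granted permit guarantees an item is present
    }

    public void release(T item) {
        items.offer(item);       // return the item before releasing the permit
        available.release();     // unblocks a waiting acquire
    }
}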

Barriers are similar to latches in that they block a group of threads until some event has occurred [CPJ 4.4.3]. The key difference is that
with a barrier, all the threads must come together at a barrier point at the same time in order to proceed. Latches are for waiting
for events; barriers are for waiting for other threads.

CyclicBarrier allows a fixed number of parties to rendezvous repeatedly at a barrier point and is useful in parallel iterative algorithms
that break down a problem into a fixed number of independent subproblems. Threads call await when they reach the barrier point,
and await blocks until all the threads have reached the barrier point. If all threads meet at the barrier point, the barrier has
been successfully passed, in which case all threads are released and the barrier is reset so it can be used again.
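A minimal sketch of repeated rendezvous at a barrier (the party count, step count, and loop body are illustrative):

import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class BarrierDemo {
    public static void main(String[] args) {
        final int parties = 3;
        // The optional barrier action runs once per crossing, after all
        // parties have arrived and before any of them is released.
        final CyclicBarrier barrier = new CyclicBarrier(parties, new Runnable() {
            public void run() {
                System.out.println("all parties reached the barrier");
            }
        });

        for (int i = 0; i < parties; i++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        for (int step = 0; step < 5; step++) {
                            // compute one step of this worker's subproblem ...
                            barrier.await();   // wait for the other workers
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } catch (BrokenBarrierException e) {
                        // another party was interrupted or timed out; give up
                    }
                }
            }).start();
        }
    }
}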

When there are more runnable threads than available processors, threads sit idle. Having many idle threads can tie up a lot of memory,
putting pressure on the garbage collector, and having many threads competing for the CPUs can impose other performance costs as well.
java.util.concurrent provides a flexible thread pool implementation as part of the Executor framework. The primary abstraction for
task execution in the Java class libraries is not Thread, but Executor.

Whenever you see code of the form:
new Thread(runnable).start()
and you think you might at some point want a more flexible execution policy, seriously consider replacing it with the use of an Executor.
An Executor implementation is likely to create threads for processing tasks. But the JVM can't exit until all the (nondaemon)
threads have terminated, so failing to shut down an Executor could prevent the JVM from exiting.
The ExecutorService interface extends Executor, adding a number of methods for lifecycle management (as well as some convenience
methods for task submission).
Timer has some drawbacks, and ScheduledThreadPoolExecutor should be thought of as its replacement.
Runnable is a fairly limiting abstraction; run cannot return a value or throw checked exceptions.
To express a non-value-returning task with Callable, use Callable<Void>.
In the Executor framework, tasks that have been submitted but not yet started can always be cancelled, and tasks that
have started can sometimes be cancelled if they are responsive to interruption.
Future represents the lifecycle of a task and provides methods to test whether the task has completed or been cancelled,
retrieve its result, and cancel the task.
The submit methods in ExecutorService all return a Future, so that you can submit a Runnable or a Callable to an executor and
get back a Future that can be used to retrieve the result or cancel the task.
CompletionService combines the functionality of an Executor and a BlockingQueue. You can submit Callable tasks to it for execution
and use the queue-like methods take and poll to retrieve completed results.
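A sketch of submitting a Callable to an ExecutorService and retrieving the result through the returned Future (the pool size and task body are illustrative):

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureDemo {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        ExecutorService exec = Executors.newFixedThreadPool(4);
        try {
            Future<Integer> result = exec.submit(new Callable<Integer>() {
                public Integer call() {
                    return 6 * 7;             // stands in for a lengthy computation
                }
            });
            System.out.println(result.get()); // blocks until the task completes
        } finally {
            exec.shutdown();                  // let the JVM exit once tasks finish
        }
    }
}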

If you've tried to write even a simple GUI application using Swing, you know that GUI applications have their own peculiar
threading issues. To maintain safety, certain tasks must run in the Swing event thread. But you cannot execute long-running tasks
in the event thread, lest the UI become unresponsive. And Swing data structures are not thread-safe, so you must be careful to
confine them to the event thread.

Nearly all GUI toolkits, including Swing and SWT, are implemented as single-threaded subsystems in which all GUI activity is confined to a single thread.
Single-threaded GUI frameworks achieve thread safety via thread confinement; all GUI objects, including visual components and data models, are
accessed exclusively from the event thread.
Because there is only a single thread for processing GUI tasks, they are processed sequentially: one task finishes before the next one
begins, and no two tasks overlap. Knowing this makes writing task code easier: you don't have to worry about interference from other tasks.
Tasks that execute in the event thread must return control to the event thread quickly.
To initiate a long-running task you must run that task in another thread so control can return quickly to the event thread.
To update a progress indicator while a long-running task executes or provide visual feedback when it completes, you again need to
execute code in the event thread. This can get complicated quickly.
The Swing single-thread rule: Swing components and models should be created, modified, and queried only from the event-dispatching thread.
SwingUtilities.invokeLater, which schedules a Runnable for execution on the event thread (callable from any thread);
SwingUtilities.invokeAndWait, which schedules a Runnable task for execution on the event thread and blocks the current thread until it
completes (callable only from a non-GUI thread);
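A sketch of the usual pattern, where a background thread finishes some work and hands the GUI update back to the event thread via invokeLater (doLongRunningWork and resultLabel are assumed to exist elsewhere):

// Running on a background thread:
final String result = doLongRunningWork();
SwingUtilities.invokeLater(new Runnable() {
    public void run() {
        resultLabel.setText(result);   // runs on the event dispatch thread
    }
});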

A program will be free of lock-ordering deadlocks if all threads acquire the locks they need in a fixed global order.
Invoking an alien method with a lock held is asking for liveness trouble. The alien method might acquire other locks (risking deadlock)
or block for an unexpectedly long time, stalling other threads that need the lock you hold.
Strive to use open calls throughout your program. Programs that rely on open calls are far easier to analyze for deadlock-freedom
than those that allow calls to alien methods with locks held.
Tasks that wait for the results of other tasks are the primary source of thread-starvation deadlock;
A program that never acquires more than one lock at a time cannot experience lock-ordering deadlock.
Try to minimize the number of potential locking interactions, follow and document a lock-ordering protocol for locks that may be acquired together,
and use open calls wherever possible.
Where intrinsic locks wait forever if they cannot acquire the lock, explicit locks let you specify a timeout after which tryLock returns failure. By using a
timeout that is much longer than you expect acquiring the lock to take, you can regain control when something unexpected happens.
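A sketch of a timed lock acquisition with ReentrantLock (the lock name and timeout are illustrative):

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TimedLocking {
    private final Lock lock = new ReentrantLock();

    public boolean doWithTimeout() throws InterruptedException {
        // Give up instead of blocking forever if the lock is not available.
        if (!lock.tryLock(2, TimeUnit.SECONDS))
            return false;               // caller can back off, log, or retry
        try {
            // access the state guarded by this lock
            return true;
        } finally {
            lock.unlock();
        }
    }
}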

To trigger a thread dump, you can send the JVM process a SIGQUIT signal (kill -3) on Unix platforms, or press the Ctrl-\ key on Unix or
Ctrl-Break on Windows platforms.
There's nothing in the JDBC specification that requires a Connection to be thread-safe, and it is common to confine use of a Connection
to a single thread, as was intended here.

Starvation occurs when a thread is perpetually denied access to resources it needs in order to make progress.
Improving performance means doing more work with fewer resources.
The major challenge in constructing tests for concurrent programs is that potential failures may be rare probabilistic occurrences rather than
deterministic ones; tests that disclose such failures must be more extensive and run for longer than typical sequential tests.

Unfortunately, test code can introduce timing or synchronization artifacts that can mask bugs that might otherwise manifest themselves.

Done.


