
The Java Specialists' Newsletter
Issue 058, 2002-10-09. Category: Performance


Counting bytes on Sockets

by Dr. Heinz M. Kabutz

Welcome to the 58th edition of The Java(tm) Specialists' Newsletter sent to 4814 Java Specialists in 86 countries.

I was quite blown away by the support and help you offered after last week's newsletter. All I can say is "thanks". Thank you especially for the encouragement to "carry on" with the newsletters; obviously, that was my intention :-)

My mother, my sister and I are now running my father's company, manufacturing drinking straws. For my sister and me it is a part-time occupation, since we both have other occupations. However, we have all been involved with the business for the last 20 years, so fortunately we know what is involved. Don't expect to see fewer newsletters though - they are still my #1 priority :-) and I will fit them in between writing Java code, training people on Design Patterns and Java, and having staff meetings.



Background

At the end of 2001, William Grosso, author of Java RMI, offered to send me a copy of his newly published book. William is a keen reader of our newsletter, and wanted to thank me for publishing it. (I am not hinting that I expect gifts from my readers ;-) Being a technical bookworm, I was quite excited to get my hands on the book, although I was not too excited about the title. How interesting could a book about RMI be?!?

The book arrived, and I started reading it here and there. I quickly noticed that it was far more than the typical Java book that only contains a beautified view of the Java APIs. It is actually a book about distributed computing, and includes chapters on threads, serialization, and similar topics. In fact, only half the book is about RMI; the rest is about writing distributed applications.

I did not have the time to finish reading the book before I went to Germany during March and April 2002. With two little children to take along, we were pressed for space, so the Java RMI book stayed in South Africa. One of my tasks in Germany was to improve the performance of an application server and naturally, one of the hotspots we looked at was the RMI communication. Remembering Bill's book, I promptly went to a local bookshop and bought another copy (that's how useful I found it!).

One of the things we wanted to look at was the number of bytes transferred over the network from and to the application server. This is where the book came in. It showed me how to specify my own socket factory for RMI and how to link in my own counting stream.

This newsletter is therefore based on ideas gleaned from Java RMI. Some parts of the book are not very advanced, and it covers parts of Java that do not seem to fit into a book on "Java RMI". Perhaps a different title would have been better. However, there are things in the book that really helped me improve RMI communication when I did not know where to turn. I therefore recommend the book to both beginners in RMI and those who have already used RMI a bit.

The actual code

When I used these tricks in Germany, I hacked the java.net.Socket class and added support for counting bytes. I did that because it was the easiest way. However, you can get into trouble legally if you ship such a hacked class by mistake. For this newsletter, I wanted to go the "right" way by providing an RMISocketFactory.

The first thing we need if we want to count the bytes is two decorators, one for the input stream and one for the output stream. To me, just knowing how many bytes were flowing past was not enough. I also wanted to be able to open the bytes in a hex editor (yeah, baby, yeah) to see what bytes were passed around. Please meet the DebuggingInputStream (lots of applause):

import java.io.*;
import java.net.Socket;

/**
 * This class counts the number of bytes read by it before 
 * passing them on to the next InputStream in the IO chain.
 * It also dumps the bytes read into a file.
 * Should probably specify a factory for making the file
 * names, however, there is enough stuff to show here without
 * such an extra.
 */
public class DebuggingInputStream extends FilterInputStream {
  // Static data and methods
  private static long totalCount = 0;
  private static long dumpNumber =
    System.currentTimeMillis() / 1000 * 1000;
    
  private synchronized static String makeFileName() {
    return "dump.read." + dumpNumber++ + ".log";
  }
  public synchronized static long getTotalCount() {
    return totalCount;
  }

  // Non-static data and methods
  private final OutputStream copyStream;
  private long count = 0;

  public DebuggingInputStream(Socket socket, InputStream in)
      throws IOException {
    super(in);
    String fileName = makeFileName();
    System.out.println(socket + " -> " + fileName);
    copyStream = new FileOutputStream(fileName);
  }

  public long getCount() {
    return count;
  }

  public int read() throws IOException {
    int result = in.read();
    if (result != -1) {
      synchronized (DebuggingInputStream.class) {
        totalCount++;
      }
      copyStream.write(result);
      count++;
    }
    return result;
  }
  public int read(byte[] b) throws IOException {
    return read(b, 0, b.length);
  }
  public int read(byte[] b, int off, int len)
      throws IOException {
    int length = in.read(b, off, len);
    if (length != -1) {
      synchronized (DebuggingInputStream.class) {
        totalCount += length;
      }
      copyStream.write(b, off, length);
      count += length;
    }
    return length;
  }
  public void close() throws IOException {
    try {
      super.close();
    } finally {
      // close the dump file even if closing the wrapped stream fails
      copyStream.close();
    }
  }
}

We have a similar class for the OutputStream. Both classes hard-code details such as the file name generation and the fact that the data is copied to a file. Obviously, that is not pretty, and could be refactored.

import java.io.*;
import java.net.Socket;

public class DebuggingOutputStream extends FilterOutputStream {
  // Static data and methods
  private static long totalCount = 0;
  private static long dumpNumber =
    System.currentTimeMillis() / 1000 * 1000;

  private synchronized static String makeFileName() {
    return "dump.written." + dumpNumber++ + ".log";
  }

  public synchronized static long getTotalCount() {
    return totalCount;
  }

  // Non-static data and methods
  private final OutputStream copyStream;
  private long count = 0;

  public DebuggingOutputStream(Socket socket, OutputStream o)
      throws IOException {
    super(o);
    String fileName = makeFileName();
    System.out.println(socket + " -> " + fileName);
    copyStream = new FileOutputStream(fileName);
  }

  public long getCount() {
    return count;
  }

  public void write(int b) throws IOException {
    synchronized (DebuggingOutputStream.class) {
      totalCount++;
    }
    count++;
    out.write(b);
    copyStream.write(b);
  }
  public void write(byte[] b) throws IOException {
    write(b, 0, b.length);
  }
  public void write(byte[] b, int off, int len)
      throws IOException {
    synchronized (DebuggingOutputStream.class) {
      totalCount += len;
    }
    count += len;
    out.write(b, off, len);
    copyStream.write(b, off, len);
  }
  public void close() throws IOException {
    try {
      super.close();
    } finally {
      // close the dump file even if closing the wrapped stream fails
      copyStream.close();
    }
  }
  public void flush() throws IOException {
    super.flush();
    copyStream.flush();
  }
}
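
The decorator idea is easier to see without the sockets and the dump files. The following is only a minimal sketch, not the classes above: CountingInputStream and CountingDemo are hypothetical names, and the stream wraps a ByteArrayInputStream instead of a socket stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical, stripped-down decorator: counts bytes, writes no dump file.
class CountingInputStream extends FilterInputStream {
  private long count = 0;

  CountingInputStream(InputStream in) {
    super(in);
  }

  long getCount() {
    return count;
  }

  @Override
  public int read() throws IOException {
    int result = in.read();
    if (result != -1) count++; // -1 signals end of stream, not a byte
    return result;
  }

  @Override
  public int read(byte[] b, int off, int len) throws IOException {
    int length = in.read(b, off, len);
    if (length != -1) count += length; // count only what was really read
    return length;
  }
  // read(byte[]) is inherited and delegates to read(b, 0, b.length)
}

class CountingDemo {
  // Drains the given bytes through the decorator and returns the count.
  static long countBytes(byte[] data) {
    try (CountingInputStream in =
             new CountingInputStream(new ByteArrayInputStream(data))) {
      in.read(); // one single-byte read
      byte[] buffer = new byte[4];
      while (in.read(buffer) != -1) {
        // keep reading in small chunks until the end of the stream
      }
      return in.getCount();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(countBytes(new byte[10]));
  }
}
```

Note that bytes passed over with skip() are not counted in this sketch, so it only reflects what the caller actually read.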

Next, let us look at our implementation of Socket, called MonitoringSocket. When you look inside java.net.Socket, you can see that all the calls get delegated to a SocketImpl class. The data member inside Socket is called impl and it is package private, meaning that it can be accessed and changed from other classes in the same package. I know what you're thinking - surely that does not happen?! Yes it does - java.net.ServerSocket sometimes sets the impl data member of Socket to null. When we then try and print the socket to the screen in the dump() method, we get a NullPointerException. We therefore have to do some hacking to check whether impl is null and if it is, we skip over it. We still want to keep a handle to that socket, because impl might be set to another value later.
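
Checking a field that you cannot reach directly can be sketched with reflection. The example below uses a hypothetical Connection class as a stand-in for java.net.Socket, since, as far as I know, recent JDKs are likely to refuse reflective access to non-public members of java.net classes unless the package is opened explicitly. ImplProbe and its method names are assumptions made for illustration only.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a class whose field we cannot reach directly.
class Connection {
  private Object impl = "some implementation";

  void detach() {
    impl = null; // simulates another class resetting the field
  }
}

class ImplProbe {
  private static final Field IMPL_FIELD;
  static {
    try {
      Field field = Connection.class.getDeclaredField("impl");
      field.setAccessible(true); // bypass the normal access check
      IMPL_FIELD = field;
    } catch (NoSuchFieldException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  static boolean isImplNull(Connection connection) {
    try {
      return IMPL_FIELD.get(connection) == null;
    } catch (IllegalAccessException e) {
      return true; // treat an unreadable field like a missing impl
    }
  }

  public static void main(String[] args) {
    Connection connection = new Connection();
    System.out.println(isImplNull(connection));
    connection.detach();
    System.out.println(isImplNull(connection));
  }
}
```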

The rest of MonitoringSocket is fairly straightforward. We have a monitoring thread that dumps the active sockets once every 5 seconds. Yes, it is again hard-coded, but this is debugging code, not production code.

We then have a non-static initializer block and two constructors. At compile time, the contents of the non-static initializer blocks are copied into the beginning of the constructors (after the call to super()). We only show the two constructors needed for the socket factories, the no-args constructor and the one taking a hostname as String and the port.
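
The ordering is easy to verify with a small experiment. This is a sketch with hypothetical class names (Base, Derived, InitOrderDemo); it records the order in which the superclass constructor, the initializer block and the constructor bodies run.

```java
import java.util.ArrayList;
import java.util.List;

class Base {
  static final List<String> LOG = new ArrayList<>();

  Base() {
    LOG.add("Base constructor");
  }
}

class Derived extends Base {
  { // non-static initializer block
    LOG.add("Derived initializer");
  }

  Derived() {
    // the implicit super() runs first, then the initializer block
    LOG.add("Derived() body");
  }

  Derived(String name) {
    super(); // explicit call, same ordering
    LOG.add("Derived(String) body");
  }
}

class InitOrderDemo {
  // Creates one object per constructor and returns the recorded order.
  static List<String> trace() {
    Base.LOG.clear();
    new Derived();
    new Derived("x");
    return new ArrayList<>(Base.LOG);
  }

  public static void main(String[] args) {
    for (String line : trace()) {
      System.out.println(line);
    }
  }
}
```

Both constructors log the Base constructor first, then the initializer block, then their own body. A constructor that starts with this(...) would not run the block itself, because the constructor it delegates to already does.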

We obviously also override the getInputStream() and getOutputStream() methods to return DebuggingInputStream and DebuggingOutputStream instances respectively.

import java.io.*;
import java.lang.ref.SoftReference;
import java.lang.reflect.Field;
import java.net.*;
import java.util.*;

public class MonitoringSocket extends Socket {
  // keep a list of active sockets, referenced by SoftReference
  private static final List sockets = new LinkedList();

  private static void dump() {
    System.out.println("Socket dump:");
    System.out.println("------------");
    System.out.println("Total bytes"
        + " read=" + DebuggingInputStream.getTotalCount()
        + ", written=" + DebuggingOutputStream.getTotalCount());
    // print all the sockets, and remove them if the Soft
    // Reference has been set to null.
    synchronized (sockets) {
      Iterator it = sockets.iterator();
      while (it.hasNext()) {
        SoftReference ref = (SoftReference)it.next();
        MonitoringSocket socket = (MonitoringSocket)ref.get();
        if (socket == null)
          it.remove();
        else if (!socket.isImplNull())
          System.out.println(socket);
      }
    }
    System.out.println();
  }

  private static Field socket_impl = null;
  static {
    try {
      socket_impl = Socket.class.getDeclaredField("impl");
    } catch (NoSuchFieldException e) {
      throw new RuntimeException(e); // keep the cause for debugging
    }
    socket_impl.setAccessible(true);
  }
  // Sometimes, the Socket.impl data member gets set to null
  // by the ServerSocket.  Yes, it is ugly, but I did not write
  // the java.net.* package ;-)
  private boolean isImplNull() {
    try {
      return null == socket_impl.get(this);
    } catch (Exception ex) {
      return true;
    }
  }

  static {
    new Thread("Socket Monitor") {
      { setDaemon(true); start(); }
      public void run() {
        try {
          while (true) {
            try {
              sleep(5000);
              dump();
            } catch (RuntimeException ex) {
              ex.printStackTrace();
            }
          }
        } catch (InterruptedException e) {} // exit thread
      }
    };
  }

  private DebuggingInputStream din;
  private DebuggingOutputStream dout;

  { // initializer block
    synchronized (sockets) {
      sockets.add(new SoftReference(this));
    }
  }
  public MonitoringSocket() {}
  public MonitoringSocket(String host, int port)
      throws UnknownHostException, IOException {
    super(host, port);
  }

  private long getBytesRead() {
    return din == null ? 0 : din.getCount();
  }

  private long getBytesWritten() {
    return dout == null ? 0 : dout.getCount();
  }

  public synchronized void close() throws IOException {
    synchronized (sockets) {
      Iterator it = sockets.iterator();
      while (it.hasNext()) {
        SoftReference ref = (SoftReference) it.next();
        if (ref.get() == null || ref.get() == this) {
          it.remove();
        }
      }
    }
    super.close();
    if (din != null) { din.close(); din = null; }
    if (dout != null) { dout.close(); dout = null; }
  }
  public InputStream getInputStream() throws IOException {
    if (din != null) return din;
    return din =
      new DebuggingInputStream(this, super.getInputStream());
  }
  public OutputStream getOutputStream() throws IOException {
    if (dout != null) return dout;
    return dout =
      new DebuggingOutputStream(this, super.getOutputStream());
  }
  public String toString() {
      return super.toString()
        + " read=" + getBytesRead()
        + ", written=" + getBytesWritten();
  }
}
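
The bookkeeping with SoftReferences can be looked at on its own. Here is a minimal sketch of such a registry, written with generics, which were not yet available when the code above was written. SoftRegistry and SoftRegistryDemo are hypothetical names, and the demo calls clear() on a reference to simulate what the garbage collector might do at some unpredictable time.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical registry: tracks objects without keeping them strongly reachable.
class SoftRegistry<T> {
  private final List<SoftReference<T>> refs = new LinkedList<>();

  synchronized SoftReference<T> register(T item) {
    SoftReference<T> ref = new SoftReference<>(item);
    refs.add(ref);
    return ref;
  }

  // Returns the items still reachable and drops cleared references.
  synchronized List<T> liveItems() {
    List<T> live = new ArrayList<>();
    for (Iterator<SoftReference<T>> it = refs.iterator(); it.hasNext();) {
      T item = it.next().get();
      if (item == null) {
        it.remove(); // referent is gone, forget the reference too
      } else {
        live.add(item);
      }
    }
    return live;
  }

  synchronized int size() {
    return refs.size();
  }
}

class SoftRegistryDemo {
  public static void main(String[] args) {
    SoftRegistry<String> registry = new SoftRegistry<>();
    registry.register("first");
    SoftReference<String> second = registry.register("second");
    second.clear(); // simulate the collector clearing the reference
    System.out.println(registry.liveItems());
  }
}
```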

The next job is to find all the places in RMI where sockets are created. The most obvious place is in the ServerSocket, so let us change that first:

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class MonitoringServerSocket extends ServerSocket {
  public MonitoringServerSocket(int port) throws IOException {
    super(port);
  }
  public Socket accept() throws IOException {
    Socket socket = new MonitoringSocket();
    implAccept(socket);
    return socket;
  }
}
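
To see that implAccept() really hands back a socket of our own type, here is a small loopback experiment. It is only a sketch: TaggedSocket, TaggedServerSocket and AcceptDemo are hypothetical names standing in for the monitoring classes, and it assumes the machine allows binding a socket on the loopback address.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical marker subclass, standing in for MonitoringSocket.
class TaggedSocket extends Socket {
}

class TaggedServerSocket extends ServerSocket {
  TaggedServerSocket() throws IOException {
    // port 0 asks the operating system for any free port
    super(0, 1, InetAddress.getLoopbackAddress());
  }

  @Override
  public Socket accept() throws IOException {
    Socket socket = new TaggedSocket(); // unconnected socket of our own type
    implAccept(socket);                 // ServerSocket connects it for us
    return socket;
  }
}

class AcceptDemo {
  // Sends one byte over loopback and reports the accepted socket's class
  // together with the byte that arrived.
  static String acceptedType() {
    try (ServerSocket server = new TaggedServerSocket();
         Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                    server.getLocalPort());
         Socket accepted = server.accept()) {
      client.getOutputStream().write(42);
      int received = accepted.getInputStream().read();
      return accepted.getClass().getSimpleName() + ":" + received;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(acceptedType());
  }
}
```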

Next, we need to tackle the place where RMI creates sockets. RMI lets you specify your own socket factory through the java.rmi.server.RMISocketFactory class. The default socket factory provided by Sun is the sun.rmi.transport.proxy.RMIMasterSocketFactory class, which contains logic for reusing sockets. It is quite a sophisticated beast, not something that you want to write an ad-hoc implementation for. We could write our own RMISocketFactory that always creates a new socket, but then we would not see an accurate reflection of what RMI actually does.

I found the best approach (besides simply modifying java.net.Socket) is to extend the default socket factory provided by Sun, but there is a catch: Sun's socket factory delegates the actual creation to another instance of RMISocketFactory, i.e. it is just a Decorator for a plain socket factory. The handle to the decorated object is called initialFactory, so I made that handle point to an instance of RMISocketFactory that creates my MonitoringSocket and MonitoringServerSocket classes.

There is another catch that I did not address in my code. Sometimes, when you want to speak RMI from behind a firewall, Sun's socket factory creates a socket that can speak over HTTP or CGI interfaces. I do not cover that case; I only cover normal sockets.

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.rmi.server.RMISocketFactory;
import sun.rmi.transport.proxy.RMIMasterSocketFactory;

public class MonitoringMasterSocketFactory
    extends RMIMasterSocketFactory {
  public MonitoringMasterSocketFactory() {
    initialFactory = new RMISocketFactory() {
      public Socket createSocket(String host, int port)
          throws IOException {
        return new MonitoringSocket(host, port);
      }
      public ServerSocket createServerSocket(int port)
          throws IOException {
        return new MonitoringServerSocket(port);
      }
    };
  }
}

How do you activate this socket factory? At some point at the start of your program, you have to say:

RMISocketFactory.setSocketFactory(
  new MonitoringMasterSocketFactory());
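
The sun.rmi.* classes are JDK internals and may be missing or inaccessible on other or newer JVMs. If that is a concern, a factory can be built on the public java.rmi.server.RMISocketFactory API alone, at the cost of losing whatever the default factory does behind the scenes. The following is a minimal sketch of that alternative, not the approach used above; CountingSocketFactory and FactoryDemo are hypothetical names, and the factory merely counts how many sockets it is asked to create.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.rmi.server.RMISocketFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory that only counts how many sockets RMI asks for.
class CountingSocketFactory extends RMISocketFactory {
  final AtomicInteger clientSockets = new AtomicInteger();
  final AtomicInteger serverSockets = new AtomicInteger();

  @Override
  public Socket createSocket(String host, int port) throws IOException {
    clientSockets.incrementAndGet();
    return new Socket(host, port);
  }

  @Override
  public ServerSocket createServerSocket(int port) throws IOException {
    serverSockets.incrementAndGet();
    return new ServerSocket(port);
  }
}

class FactoryDemo {
  // Calls the factory directly, the way RMI would, and reports the count.
  static int serverSocketsCreated() {
    CountingSocketFactory factory = new CountingSocketFactory();
    try (ServerSocket server = factory.createServerSocket(0)) {
      return factory.serverSockets.get();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // A real program would instead install the factory once at startup:
    // RMISocketFactory.setSocketFactory(new CountingSocketFactory());
    System.out.println(serverSocketsCreated());
  }
}
```

Keep in mind that RMISocketFactory.setSocketFactory() may only be called once per JVM, so the factory has to be installed before anything else sets one.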

I'll include some sample code for those who need it:

import java.rmi.*;
import java.util.Map;

public interface RMITestI extends Remote {
  String NAME = "rmitest";
  Map getValues(Map old) throws RemoteException;
}

The implementation of that interface is shown here. It is just a dumb example of sending data backwards and forwards, where the size of the data being passed grows exponentially:

 
import java.io.IOException;
import java.rmi.*;
import java.rmi.server.*;
import java.util.*;

public class RMITest extends UnicastRemoteObject
    implements RMITestI {
  private final Map values = new HashMap();

  public RMITest() throws RemoteException {}

  public Map getValues(Map old) {
    synchronized (values) {
      values.putAll(old);
      return values;
    }
  }

  public static void main(String[] args) throws IOException {
    RMISocketFactory.setSocketFactory(
      new MonitoringMasterSocketFactory());
    System.setSecurityManager(new RMISecurityManager());
    Naming.rebind(RMITestI.NAME, new RMITest());
  }
}

And lastly, some client code that connects to the RMI Server and executes the method a number of times. You can see the data that gets passed backwards and forwards by looking at the dump files.

import java.io.Serializable;
import java.rmi.*;
import java.util.*;

public class RMITestClient {
  public static void main(String args[]) throws Exception {
    System.setSecurityManager(new RMISecurityManager());
    RMITestI test = (RMITestI)Naming.lookup(RMITestI.NAME);
    Map values = new HashMap();
    values.put(new Serializable() {}, "Today");
    for (int i = 0; i < 13; i++) {
      System.out.print('.');
      System.out.flush();
      values.putAll(test.getValues(values));
    }
  }
}
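
Why the data grows so quickly can be estimated without a network. The sketch below assumes that RMI passes the maps by value, which behaves roughly like a serialization round trip, and that the keys do not override equals() or hashCode(), so every copy counts as a new key. OpaqueKey and GrowthDemo are hypothetical names, and the simulation only tracks map sizes, not bytes on the wire.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type: like the anonymous Serializable in the client, it
// relies on identity, so a deserialized copy never equals the original.
class OpaqueKey implements Serializable {
  private static final long serialVersionUID = 1L;
}

class GrowthDemo {
  // Copies an object through serialization, as a stand-in for marshalling.
  @SuppressWarnings("unchecked")
  static <T> T copy(T original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(original);
      }
      try (ObjectInputStream in = new ObjectInputStream(
               new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  // Returns {client map size, server map size} after the given rounds.
  static int[] simulate(int rounds) {
    Map<Object, String> server = new HashMap<>();
    Map<Object, String> client = new HashMap<>();
    client.put(new OpaqueKey(), "Today");
    for (int i = 0; i < rounds; i++) {
      server.putAll(copy(client)); // the argument arrives as a copy
      client.putAll(copy(server)); // the return value arrives as a copy
    }
    return new int[] {client.size(), server.size()};
  }

  public static void main(String[] args) {
    for (int rounds = 1; rounds <= 5; rounds++) {
      int[] sizes = simulate(rounds);
      System.out.println(rounds + ": client=" + sizes[0]
          + ", server=" + sizes[1]);
    }
  }
}
```

In this simulation each side grows by the size of the other on every call, so the sizes follow alternate Fibonacci numbers (2, 5, 13, 34, 89 on the client). Thirteen rounds would leave 196418 entries in the client map under these assumptions.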

When we run this code, on the server side we can now see the following output:

Socket[addr=cohiba/1.0.0.1,port=1099,localport=2135] read=0, written=0
  -> dump.written.1034160774000.log
Socket[addr=cohiba/1.0.0.1,port=1099,localport=2135] read=0, written=7
  -> dump.read.1034160775000.log
Socket[addr=cohiba/1.0.0.1,port=2137,localport=2134] read=0, written=0
  -> dump.read.1034160775001.log
Socket[addr=cohiba/1.0.0.1,port=2137,localport=2134] read=7, written=0
  -> dump.written.1034160774001.log
Socket dump:
------------
Total bytes read=507, written=539
Socket[addr=cohiba/1.0.0.1,port=2137,localport=2134] read=471, written=301
Socket[addr=cohiba/1.0.0.1,port=1099,localport=2135] read=36, written=238

Socket[addr=cohiba/1.0.0.1,port=2140,localport=2134] read=0, written=0
  -> dump.read.1034160775002.log
Socket[addr=cohiba/1.0.0.1,port=2140,localport=2134] read=7, written=0
  -> dump.written.1034160774002.log
Socket dump:
------------
Total bytes read=460579, written=403442
Socket[addr=cohiba/1.0.0.1,port=2137,localport=2134] read=471, written=301
Socket[addr=cohiba/1.0.0.1,port=1099,localport=2135] read=36, written=238
Socket[addr=cohiba/1.0.0.1,port=2140,localport=2134] read=460072, written=402903

Socket dump:
------------
Total bytes read=652415, written=1052723
Socket[addr=cohiba/1.0.0.1,port=2137,localport=2134] read=471, written=301
Socket[addr=cohiba/1.0.0.1,port=1099,localport=2135] read=36, written=238
Socket[addr=cohiba/1.0.0.1,port=2140,localport=2134] read=651908, written=1052184

Socket dump:
------------
Total bytes read=1702402, written=1091048
Socket[addr=cohiba/1.0.0.1,port=2140,localport=2134] read=1701895, written=1090509

Socket dump:
------------
Total bytes read=1702402, written=2301608
Socket[addr=cohiba/1.0.0.1,port=2140,localport=2134] read=1701895, written=2301069

Socket dump:
------------
Total bytes read=1702402, written=2752353

Socket dump:
------------
Total bytes read=1702402, written=2752353

We can use this MonitoringSocket wherever RMI is used, for example, in EJB servers like JBoss. If the socket factories are too much P.T. for you (P.T. stands for Physical Training, something that computer nerds try to avoid ;-) you can go and hack java.net.Socket and add the code from the MonitoringSocket in there.

I hope this newsletter will be as useful to you as it was to me. If you want to know more about these topics, please look at William Grosso's Java RMI.

A special thanks to William Grosso for all his help in writing custom sockets.

Until the next newsletter ...

Your Java/Design Patterns/Drinking Straw specialist.

Heinz


© 2010-2016 Heinz Kabutz - All Rights Reserved
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. JavaSpecialists.eu is not connected to Oracle, Inc. and is not sponsored by Oracle, Inc.