Benchmarking with an Abstract Class
By: Steven Feuerstein
Jul. 1, 1999 12:00 AM
I've spent over a decade working with Oracle technology to develop and deploy applications. In the process I've developed an area of expertise: the Oracle PL/SQL language. PL/SQL is a flexible, powerful procedural database programming language. There is no doubt that if you want to interact with the Oracle database, PL/SQL is the way to go. It has even adopted some object-oriented capabilities in Oracle8. With Oracle's decision to integrate a Java Virtual Machine directly into its database, however, it's incumbent upon all Oracle developers to learn the Java language. Most important, we need to figure out how to use both Java and PL/SQL within the Oracle server technology to get the best of both worlds. I'm busy learning Java and am both delighted and dismayed by the features of this most modern object-oriented language. I recently spent some time wrestling with and figuring out how to use abstract classes. This article shares the lessons I learned; my hope is that developers new to Java will be able to understand more easily this important but somewhat obscure facet of Java.
Analyzing Program Performance
One of the handiest utilities I constructed in PL/SQL was the PLVtmr package (packages in PL/SQL are similar to static classes in Java). Developers use this package's programs (PLVtmr.capture and PLVtmr.showelapsed) to calculate the elapsed time of a particular program or code segment. This timer utility proved handy because it offered both fine granularity (timing to the nearest hundredth of a second) and focus. Rather than run an elaborate analysis/trace session, I could quickly determine the performance of a single program. I could also compare different implementations of a given requirement to find the version with optimal performance.
I decided to build a mechanism in Java similar to the PLVtmr package. After a small amount of research, I discovered that the System.currentTimeMillis method returns the current time as the number of milliseconds since midnight, January 1, 1970 UTC. And just in case you were concerned about Y2K problems in Java, rest assured that the long value returned by System.currentTimeMillis will not overflow for roughly 292 million years. (I joke with PL/SQL students that the authors of Java experienced a bout of "language envy." PL/SQL calculates elapsed time to the hundredth of a second, so naturally Java must go down to the thousandth of a second. In reality, of course, I imagine that Gosling and others didn't pay a whole lot of attention to Oracle PL/SQL as they created their own "Write once, run anywhere" language.)
Great! So I have a mechanism that provides the time to the nearest millisecond. How do I use that method to calculate the elapsed time of a program? By comparing consecutive calls to the method. Listing 1 shows a very simple class that demonstrates the technique by timing how long it takes to count the number of bytes in the specified file. The first time I executed HowFast.main, 70 milliseconds elapsed. Subsequent executions revealed a steady-state elapsed time of 40 milliseconds (see Figure 1).
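The technique of comparing two consecutive calls to System.currentTimeMillis can be sketched as follows. This is a minimal stand-in, not the article's Listing 1: the class name HowFastSketch and the countBytes helpers are hypothetical, and only the variable names timeBefore and timeAfter are taken from the text.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the timing script described above (not the original listing).
class HowFastSketch {

    // Count bytes the slow way: read one byte at a time until end of stream.
    static long countBytes(InputStream in) throws IOException {
        long total = 0;
        while (in.read() != -1) {
            total++;
        }
        return total;
    }

    // Open the named file, count its bytes, and close it again.
    static long countBytes(String filename) throws IOException {
        try (InputStream in = new FileInputStream(filename)) {
            return countBytes(in);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the file name is passed as the first command-line argument.
        if (args.length == 0) {
            System.out.println("Usage: java HowFastSketch <file>");
            return;
        }

        long timeBefore = System.currentTimeMillis();
        long total = countBytes(args[0]);
        long timeAfter = System.currentTimeMillis();

        System.out.println("Bytes: " + total);
        System.out.println("Elapsed: " + (timeAfter - timeBefore) + " millisecs");
    }
}
```

The first run of such a script typically includes class loading and file-system caching effects, which is one plausible reason the first measurement differs from later ones.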
Well, if you hadn't known about System.currentTimeMillis before picking up this article, you've now learned something new about Java! You've also seen how to put this method to use in your code to calculate elapsed time. Sadly, this is as far as most developers go; they learn about a new method, then write a script each time they need to apply the technology. They might even construct a "template" class in a file as follows (also stripping the code down to its bare minimum):
class HowFast {
I suggest, however, that a much more sensible, productive and elegant solution would be to construct a class that encapsulates the details of the elapsed-time computation and exposes a set of methods to get the job done. I will build such a class in this article, and then extend it to an abstract class so you can easily perform benchmarks on code that you construct.
Performance Comparison Example
The InFile.numBytes method counts the number of bytes explicitly, while InFile.numBytes2 simply takes advantage of the available() method to return the total. By using available() I'm taking a bit of a risk: it only estimates how many bytes can be read without blocking, so for a very large file it may not report every byte. For the purposes of my test, however, this will serve just fine. I also catch any IO exceptions and return -1 to indicate a problem; that way developers can use numBytes without having to worry about exceptions popping out of the program. Now I'd like to determine which of these implementations is faster. My hunch is that numBytes2 is faster, since it takes advantage of the available() method. I should also see if their behavior changes in response to differently sized files. So let's build a Timer class to help us answer these questions.
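The two approaches described above might look like the sketch below. It is a hypothetical reconstruction rather than the article's InFile listing: the class name InFileSketch and the InputStream overloads are assumptions added so the logic can be exercised without a file on disk.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reconstruction of the two byte-counting approaches.
class InFileSketch {

    // Brute force: read every byte and count. Returns -1 on any IO problem.
    static int numBytes(InputStream in) {
        try {
            int total = 0;
            while (in.read() != -1) {
                total++;
            }
            return total;
        } catch (IOException e) {
            return -1;
        }
    }

    // Shortcut: ask the stream how many bytes it estimates can be read
    // without blocking. Returns -1 on any IO problem.
    static int numBytes2(InputStream in) {
        try {
            return in.available();
        } catch (IOException e) {
            return -1;
        }
    }

    // File-name variants; a missing or unreadable file also yields -1.
    static int numBytes(String filename) {
        try (InputStream in = new FileInputStream(filename)) {
            return numBytes(in);
        } catch (IOException e) {
            return -1;
        }
    }

    static int numBytes2(String filename) {
        try (InputStream in = new FileInputStream(filename)) {
            return numBytes2(in);
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // Assumption: the file name is passed as the first command-line argument.
        if (args.length == 0) {
            System.out.println("Usage: java InFileSketch <file>");
            return;
        }
        System.out.println("numBytes:  " + numBytes(args[0]));
        System.out.println("numBytes2: " + numBytes2(args[0]));
    }
}
```

Returning -1 keeps callers free of try/catch blocks, at the cost of hiding the reason for the failure.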
The Timer Class
class HowFast1 {
The results of running HowFast1 are shown in Figure 2. Notice how much is missing in this new implementation of a timing script?
Of course, I can also perform multiple timings by instantiating more than one Timer object, which is useful when I want to compare different implementations. This approach is shown in Listing 4. The output from executing this script is shown in Figure 3. As you can see, and as I suspected, the version based on available() is consistently faster. Actually, now that I mention it, how can you tell which result goes with which test? Not simply by eyeballing the output. Wouldn't it be nice to be able to include a message or context with the results so you can interpret them more easily? Let's modify the HowFast script once more and see, in Listing 5, how intelligible we can make the output. Now, I have added a word describing each test in the call to showElapsed(). This information is then added to the output message, as shown in Figure 4. You should now have a solid idea of how we want the Timer class to work. Let's explore the implementation.
Class Elements
// Start time elements
// Stop time elements
// Context, as in: description of the
For both start and stop points, Timer keeps track of the value returned by System.currentTimeMillis. (mstart and mstop therefore correlate to timeBefore and timeAfter in the original script example.) It also uses boolean elements to remember whether a timing session for the object has started and/or ended. These flags will ensure that valid actions are taken (you can't, for example, end a timing session before you start it). Finally, the mcontext element is a string that will be used to display information about the timing session if the programmer provides context information in the calls to start() and/or stop().
Starting the Timer
// Start the "clock ticking". Of course,
The start() method sets the various elements ("I have started but I have not yet ended.") and then constructs the context string based on the value passed in.
Stopping the Timer
public void stop (String context) {
if (context.length() > 0)
There isn't much to stop(); it makes sure you started the timer. Then it gets a snapshot of the current time and adds information to the context string.
Retrieving Elapsed Time
As you can see from Table 1, I'm very careful to use previously defined methods to implement other methods. By doing so I avoid code redundancy and get more consistent behavior from the various methods. Here is the implementation of the elapsed() method:
public long elapsed (String context) {
If the developer hasn't already stopped the timer, elapsed() calls stop() and returns the difference between the stop and start values. It's important to provide the "raw" elapsed() method, because a programmer might want to take that value and manipulate it further by performing calculations, comparing it to values stored in a hashtable or formatting a different message. The elapsedMessage() method is implemented as follows:
public String elapsedMessage (String
// Shut down the timing ASAP to make
The first thing it does is obtain the elapsed time; as a reminder, the elapsed() method calls stop() if that hasn't already happened. Then it constructs the standard message by incorporating the context information. The toString method is provided so you can reference a Timer object as part of a string concatenation and see something sensible. Here is its implementation:
public String toString () { return elapsedMessage(""); }
It does nothing more than call elapsedMessage(), passing an empty string for the context (you can't provide arguments to the toString() method if you want it to be used automatically by the Java runtime engine). Finally, there is the showElapsed() method. As you can see below, all it does is display the value returned by elapsedMessage.
public void showElapsed (String context) {
Overloadings for Null Contexts
public void start () { start (""); }
These overloadings might seem like an unnecessary step, but users of the Timer class will appreciate this kind of effort. Well, to be honest, they probably won't appreciate it because they won't even notice it. They'll take it totally for granted, and that's just fine; it means you've designed your code very well.
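Putting the pieces described so far together, a Timer along these lines might look like the following sketch. It is not the article's listing. The field names mstart, mstop and mcontext come from the text, but the flag names, the default word "start", the choice to throw when stop() precedes start(), and the exact message format (modeled on the sample output "Elapsed from start to Countem: 1723 millisecs") are assumptions.

```java
// Hypothetical sketch of the Timer class described above.
class TimerSketch {
    private long mstart;               // clock reading when timing started
    private long mstop;                // clock reading when timing stopped
    private boolean mstarted = false;  // has a timing session begun?
    private boolean mstopped = false;  // has it ended?
    private String mcontext = "";      // describes the timing session

    public void start(String context) {
        mstarted = true;
        mstopped = false;
        // Assumption: an empty context falls back to the word "start".
        mcontext = context.length() > 0 ? context : "start";
        // Read the clock last so the setup above is not counted.
        mstart = System.currentTimeMillis();
    }

    public void stop(String context) {
        // Read the clock first so the bookkeeping below is not counted.
        long now = System.currentTimeMillis();
        if (!mstarted) {
            // Assumption: stopping before starting is reported as an error.
            throw new IllegalStateException("Timer was never started");
        }
        mstop = now;
        mstopped = true;
        if (context.length() > 0) {
            mcontext = mcontext + " to " + context;
        }
    }

    // Raw elapsed milliseconds; stops the timer first if still running.
    public long elapsed(String context) {
        if (!mstopped) {
            stop(context);
        }
        return mstop - mstart;
    }

    public String elapsedMessage(String context) {
        long millis = elapsed(context); // stop timing before building strings
        return "Elapsed from " + mcontext + ": " + millis + " millisecs";
    }

    public void showElapsed(String context) {
        System.out.println(elapsedMessage(context));
    }

    // Assumption: an unstarted timer prints a fixed note instead of throwing.
    @Override
    public String toString() {
        return mstarted ? elapsedMessage("") : "Timer not started";
    }

    // No-argument overloads delegate with an empty context.
    public void start() { start(""); }
    public void stop() { stop(""); }
    public long elapsed() { return elapsed(""); }
    public String elapsedMessage() { return elapsedMessage(""); }
    public void showElapsed() { showElapsed(""); }
}
```

Each method is built on the one beneath it (showElapsed on elapsedMessage, elapsedMessage on elapsed, elapsed on stop), so the timing logic lives in exactly one place.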
Improving Ease of Use of Timer
First, I haven't created any constructor methods for the Timer. Yet it seems that whenever I instantiate a new Timer, I almost always want to call the start() method for that object. Why not create a constructor or two that will automatically set the start time? Here are the definitions of those constructors:
public Timer (String context) { this.start
I'll make use of these in the subsequent examples. What else might we want to do? Well, most of the time when I'm testing the performance of a program, I don't want to run that code just once. I might want to run it many times, perhaps even thousands, for a number of reasons, including:
class HowFast4 {
Timer bruteForce = new Timer();
for (int execnum = 1; execnum <=
bruteForce.showElapsed("Countem");
When I invoke the class, I pass the number of times to run the code as the first and only argument, as in:
java HowFast4 100
and I get a line of output that looks like this:
Elapsed from start to Countem: 1723 millisecs
The main method calls the Integer.parseInt method to return the numeric value of the integer represented by the contents of the given string object, which is the first element in the string array, args. Then I use this value (the count element) to limit the execution of a numeric for loop. Notice that I'm assuming the presence of the Timer constructor to automatically call start(). If I want to compare the performance of my two different implementations, I end up with code similar to Listing 6, which results in an output like this:
E:\Articles\Java>java HowFast5 250
As I write this code, it strikes me that I'm engaged in a fairly awkward, repetitive process. What if I want to compare three or four implementations? And wouldn't it be nice to see not only the total elapsed time, but also the average amount of time it took for each execution? Notice that I execute the same body of code again and again. Here, for example, is all of the "generic" code that performs the timing:
Timer myTimer = new Timer();
for (int execnum = 1; execnum <= count;
myTimer.showElapsed("Context");
Doesn't it seem that I should be able to create a template of the repetitive stuff and then just "fill in the blanks" in the template when I need it? Well, I can do precisely that with an abstract class.
Driving Timer to Abstraction
abstract class Benchmark {
They then extended that abstract class with a "do nothing" benchmark that I must admit I found hard to understand. After playing around with it for a while and reading through other books, I finally felt comfortable enough with abstract classes to design my own benchmarking class, built on top of the Timer class. Back to "abstract class." What is it? How does it help me get my job done? A class is abstract if it is declared with the abstract keyword, and it must be declared that way if it has one or more abstract methods. An abstract method is a method that has a signature or header but no implementation or body. Here's a very simple abstract class:
abstract class Beliefs
What am I expressing in this class? Human beings have belief systems, but their beliefs differ according to many factors, in this case religious affiliation. If I'm a Jew, my belief about God differs from that held by a Christian. The day of the week that is considered my Sabbath or day of rest is also different. On the one hand, humans have common characteristics: views on God and the day of rest. On the other hand, the actual content behind those characteristics (i.e., implementation of the method) is different for each religion or belief system. Once I have defined these general characteristics and created "placeholders" for their implementation, I can extend my abstract Beliefs class for particular faiths (or lack thereof) as shown in the four classes in Listing 7, which generates this output:
Muslims are sometimes found reading the Quran on Friday
Notice that even though I create an array of Beliefs objects (called believers), I can assign subclasses of Beliefs (Jews, Muslims, etc.) to that array without casting to the Beliefs class. In addition, I can invoke the holyBook() and dayOfPrayer() methods for these Beliefs objects (the elements of the believers array), and the methods defined in each of the various subclasses will be invoked. Thus I can write programs that don't explicitly reference subclasses (Jew, Muslim, etc.) and are therefore more general, but still take advantage of the more specific functionality. This also means that as I extend the abstract class, Beliefs, to other subclasses, existing programs that reference only the Beliefs class will run for the new subclasses as well. That is a brief explanation of abstract classes. Let's see how we can apply this Java feature to my Timer class.
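A compact sketch of the idea follows. It is not the article's Listing 7: the class names Beliefs, Jew, Christian and Muslim and the methods holyBook() and dayOfPrayer() come from the text, while followers(), describe() and BeliefsDemo are hypothetical helpers added to produce a line in the style of the sample output. Only three subclasses are shown.

```java
// Abstract superclass: declares what every belief system can be asked,
// but leaves the answers to the subclasses.
abstract class Beliefs {
    abstract String holyBook();
    abstract String dayOfPrayer();

    // Hypothetical helper added for this sketch: names the group in the output.
    abstract String followers();

    // Concrete method shared by all subclasses; it relies on the abstract ones.
    String describe() {
        return followers() + " are sometimes found reading the "
                + holyBook() + " on " + dayOfPrayer();
    }
}

class Jew extends Beliefs {
    String holyBook() { return "Torah"; }
    String dayOfPrayer() { return "Saturday"; }
    String followers() { return "Jews"; }
}

class Christian extends Beliefs {
    String holyBook() { return "Bible"; }
    String dayOfPrayer() { return "Sunday"; }
    String followers() { return "Christians"; }
}

class Muslim extends Beliefs {
    String holyBook() { return "Quran"; }
    String dayOfPrayer() { return "Friday"; }
    String followers() { return "Muslims"; }
}

class BeliefsDemo {
    public static void main(String[] args) {
        // The array is typed as Beliefs; no casts are needed to store subclasses.
        Beliefs[] believers = { new Jew(), new Christian(), new Muslim() };
        for (Beliefs believer : believers) {
            // The subclass implementation is chosen at run time.
            System.out.println(believer.describe());
        }
    }
}
```

The loop in main never names a subclass, so adding another subclass of Beliefs would require no change to it.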
The RunTimer1 Class
This is a somewhat tricky prospect. The code to be tested must be placed inside the FOR loop, but I don't want to make developers write the FOR loop (and other functionality, as we'll see). Conceptually, I want to build a method that implements the FOR loop but contains a kind of "placeholder" for the developer's code. This is exactly where the abstract class comes in handy. I'm going to extend Timer to an abstract class (remember: Timer is not abstract) called RunTimer1, as follows:
abstract class RunTimer1 extends Timer {
As you can see, I extended a nonabstract class to an abstract class. A class that contains an abstract method must be declared abstract. RunTimer1 contains two methods, only one of which is abstract:
timeIt(): This is the abstract method of the class and therefore contains no implementation. It's the "placeholder" program run by repeat(). It accepts a single object in its parameter list, which is why the class is called RunTimer1 (1 argument version). Here is the complete definition of timeIt() in RunTimer1:
abstract void timeIt (Object arg1);
Since it's abstract, it presents only the header information: everything you need to know to be able to invoke the method. But because it has no body, you can't invoke it on an instance of RunTimer1 itself; what would the Java runtime engine execute?
repeat(): The method starts the timer, executes the FOR loop calling timeIt() and then shows elapsed time, as well as some new information: total number of iterations executed and the average elapsed time per iteration. Here is the implementation of repeat():
public void repeat (String context,
for (int execnum = 0; execnum <count;
super.showElapsed( context);
As you can see, it calls various methods of the superclass, Timer, as it moves through the steps previously described. (I fully qualify the method invocations to show the reliance on the superclass method, but it's not necessary.) The object argument, arg1, is simply passed in to the abstract timeIt() method. After it calls showElapsed() to display the elapsed time, it offers some "added value" appropriate to this looped execution by displaying additional information available only in RunTimer1, where it knows about the number of iterations. Notice that the body of the loop consists of a call to the timeIt() method. But what good does that do? The timeIt() method doesn't have a body. In fact, it's impossible to even instantiate the RunTimer1 class, much less invoke its repeat() method. You can't instantiate an abstract class. Why? Consider the following class:
class ZoomZoom
This class will fail to compile with this error:
ZoomZoom.java:5: class RunTimer1 is an abstract class. It can't be instantiated.
What sense could it possibly make anyway? myTimer.repeat calls timeIt() and...there's nothing there to run. The only thing you can do with an abstract class is extend it to another nonabstract class. Then we'll be able to obtain the precise desired behavior: my subclass of RunTimer1 will implement its own version of timeIt(), and when it invokes repeat(), the subclass's code will execute and be timed. And that is pretty darn cool.
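The template idea described above can be sketched as follows. This is not the article's RunTimer1 listing: MiniTimer is a stripped-down, hypothetical stand-in for Timer that holds only what repeat() needs, the wording of the printed lines is assumed, and CountCalls is a made-up subclass whose only purpose is to show that repeat() runs the subclass's timeIt().

```java
// Hypothetical, stripped-down stand-in for the article's Timer class.
class MiniTimer {
    private long mstart;

    public void start() {
        mstart = System.currentTimeMillis();
    }

    public long elapsed() {
        return System.currentTimeMillis() - mstart;
    }
}

// The abstract class supplies the loop and the reporting;
// subclasses supply only the code to be timed.
abstract class RunTimer1Sketch extends MiniTimer {

    // Placeholder for the code under test: header only, no body.
    abstract void timeIt(Object arg1);

    public void repeat(String context, int count, Object arg1) {
        start();
        for (int execnum = 0; execnum < count; execnum++) {
            timeIt(arg1); // resolved at run time to the subclass's version
        }
        long total = elapsed();

        double average = count > 0 ? (double) total / count : 0.0;
        System.out.println("Elapsed for " + context + ": " + total + " millisecs");
        System.out.println("Iterations: " + count);
        System.out.println("Average per iteration: " + average + " millisecs");
    }
}

// Made-up concrete subclass: counts how often timeIt() is invoked.
class CountCalls extends RunTimer1Sketch {
    int calls = 0;

    void timeIt(Object arg1) {
        calls++;
    }

    public static void main(String[] args) {
        // new RunTimer1Sketch() would not compile: the class is abstract.
        CountCalls test = new CountCalls();
        test.repeat("counting calls", 250, "unused argument");
        System.out.println("timeIt() ran " + test.calls + " times");
    }
}
```

Because repeat() lives in the abstract class, each new benchmark needs only a timeIt() body and a call to repeat().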
Extending RunTimer1
class TestInFile extends RunTimer1 {
TestInFile is a subclass of RunTimer1; an instantiation of TestInFile (as is seen in the main() method) will thus inherit two methods, timeIt() and repeat(). One of them, repeat(), is "ready to go," but the timeIt() method was never implemented in RunTimer1. That's okay; we'll just implement it in TestInFile.
void timeIt (Object arg1) {
This very first method in the class has the same signature as timeIt in RunTimer1. This override, however, adds an implementation that defines timeIt to be a "pass-through" to call the InFile.numBytes() method. It converts the single object argument to a string that contains the file name. When I extend an abstract class, I must provide an implementation for each and every abstract method in my abstract superclass (in this case, just one). If I don't, my subclass will not compile unless I declare it abstract as well. With timeIt() implemented, my subclass will compile and I'll be able to run the repeat() method. So let's now take a look at the main() method and explore how it does its testing:
public static void main (String[] args) {
First, main() instantiates a TestInFile object. I can do this since TestInFile extends an abstract class and provides an implementation of its abstract methods. I then call the inherited repeat() method, passing in the context or name of the test, the number of iterations and the name of the file. When repeat() executes, it calls timeIt(). Then the wonders of polymorphism come to the fore, so TestInFile's timeIt() is executed rather than RunTimer1's timeIt() abstraction. Here is the output from an execution of TestInFile:
E:\Articles\Java>java TestInFile 250 c:\temp\te_event.pks
Comparing Performances with RunTimer
Not as far as I can tell. I'd need to replace the abstract timeIt with two different implementations, one calling numBytes and another calling numBytes2. That doesn't seem possible. I could, however, create two different classes and run them separately, generating output like the following:
E:\Articles\Java>java TimeNumBytes 250 c:\temp\te_event.pks
E:\Articles\Java>java TimeNumBytes2 250 c:\temp\te_event.pks
This works, but it's not as clean as I'd like. Create a new class just to test a different implementation? Yuck! An alternative is to create another version of the RunTimer1 class that has two abstract methods (or, to extend the concept, n methods to test n implementations) and two repeat methods, as in RunTimer1x2 (timeIt1 and timeIt2 each accept 1 argument) in Listing 8. Then I can create a class to time both of my implementations using the abstract class, as seen in Listing 9. Here is the output from running this class's main() method:
E:\Articles\Java>java TimeBoth 250 c:\temp\te_event.pks
This isn't the ideal solution as far as I'm concerned. You'd have to create a new version of RunTimer for each combination of the number of arguments to timeIt() and the number of implementations you want to compare. There may be a more flexible way to support this variability, but I haven't found it yet.
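The two-method variant described above might be sketched like this. It is not the article's Listing 8 or Listing 9: the method names repeat1 and repeat2, the report helper and its output format are assumptions, the clock is read directly instead of extending Timer to keep the sketch short, and an in-memory byte array stands in for the file so the example runs anywhere.

```java
import java.io.ByteArrayInputStream;

// Hypothetical sketch of an abstract class with two placeholders to compare.
abstract class RunTimer1x2Sketch {
    abstract void timeIt1(Object arg1);
    abstract void timeIt2(Object arg1);

    // Times count executions of the first implementation; returns total millisecs.
    public long repeat1(String context, int count, Object arg1) {
        long start = System.currentTimeMillis();
        for (int execnum = 0; execnum < count; execnum++) {
            timeIt1(arg1);
        }
        return report(context, count, System.currentTimeMillis() - start);
    }

    // Same loop for the second implementation; the duplication is the drawback.
    public long repeat2(String context, int count, Object arg1) {
        long start = System.currentTimeMillis();
        for (int execnum = 0; execnum < count; execnum++) {
            timeIt2(arg1);
        }
        return report(context, count, System.currentTimeMillis() - start);
    }

    private long report(String context, int count, long total) {
        System.out.println(context + ": " + total + " millisecs for "
                + count + " iterations");
        return total;
    }
}

// Compares the read-loop count with the available() shortcut on a byte array.
class TimeBothSketch extends RunTimer1x2Sketch {
    int last1 = -1; // most recent result of the read loop
    int last2 = -1; // most recent result of available()

    void timeIt1(Object arg1) {
        ByteArrayInputStream in = new ByteArrayInputStream((byte[]) arg1);
        int total = 0;
        while (in.read() != -1) {
            total++;
        }
        last1 = total;
    }

    void timeIt2(Object arg1) {
        last2 = new ByteArrayInputStream((byte[]) arg1).available();
    }

    public static void main(String[] args) {
        byte[] data = new byte[100000];
        TimeBothSketch test = new TimeBothSketch();
        test.repeat1("read loop", 250, data);
        test.repeat2("available()", 250, data);
    }
}
```

The near-identical bodies of repeat1 and repeat2 show why this design scales poorly: every extra implementation to compare needs another abstract method and another loop.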
Summary