Let's Talk About Java Portability

Michael Edwards
Developer Technology Engineer
Microsoft Corporation

May 1997

Getting Started

I have long suspected that Java™ is high on that list of subjects every clueless know-it-all talks about incessantly. You can usually identify these topics by discussions that trend toward a dichotomy of viewpoints—listening to these people, you eventually feel compelled to believe that Java either will bring about world peace and pay off the U.S. national debt or it is just a complete waste of time. I often want to wade into these discussions, but I try to keep my mouth shut until I can offer informed opinions.

Imagine my surprise (or should I say horror) when I got back from Christmas vacation to discover that my next writing assignment required learning Java. Can you say "Be careful for what you wish, because you just might get it"? However, I was happy to find this learning experience much easier than I anticipated. Java is object-oriented by design, and as a C programmer who was dragged biting and scratching into the C++ world, it was rewarding to discover that I actually did learn a few principles and techniques in object-oriented programming over the years. Yes, I admit it, I am an OLE idiot (MSDN subscribers who have read Nigel Thompson's series of articles on OLE will understand what I mean), at least in terms of shipping production code. However, I am slowly leaving the ranks of the clueless, at least as far as Java is concerned.

My first decision, and it was a good one, was to avoid picking up one of those Program Java in Five Days While You Brush Your Teeth books. If you've walked through the computer technology section of a bookstore recently (yeah, like you can stay away), you know there are legions of Java books out there. To narrow your search, you can look for books focusing on your current strengths. Because I have a strong C background, I looked for books written for career C programmers who want to learn Java. For example, I found Java in a Nutshell by David Flanagan (Specialized Systems Consultants, December 1996. ISBN: 1556592183). I also picked up the books in The Java Series published by Addison-Wesley (for information, see http://www.awl.com/cp/javaseries/), and they were really useful, but mostly as programming references. In keeping with my philosophy of learning by doing, it didn't take long for me to plunge in and get my hands dirty. I started out by writing a sample applet that seemed useful at the time, but quickly realized that I knew only enough to be dangerous and needed more education. So I started browsing the Microsoft Java developer Web site (http://www.microsoft.com/java/) and the Sun Microsystems site (http://www.javasoft.com/), and spent some time at the following sites (this is not a comprehensive list):

Feeling invigorated with sort of an insider's perspective on Java, I went back to the sample applet that I had started (what seemed like) eons ago, but was quickly sidetracked again, reading sample applet sources in Sun's Java Development Kit (JDK)—available for download at http://www.javasoft.com/nav/developer/index.html. Naturally, reading the sample sources turned into yet more diversions as I started looking at the sources that implemented the system classes.

It turned out that my approach to a Java education was really successful. By picking a good introductory book that addressed my C background, I obtained a fairly solid, basic understanding of Java. By venturing to the Web for the next phase of my education, I got the latest news and information on Java development tools and issues. And reading source code gave me a good grounding in the details of Java's architecture and implementation.

This first article in my series on Java will help the average C programmer answer the question, "What's really the deal with this Java portability stuff?" I will focus on the design aspects of Java that are intended to make Java applications more portable across different computer platforms than those of other technologies. However, if you really want to understand the complete picture, the theory behind Java's well-touted portability is only part of your education. I have been seeing an increasing number of postings in the Java newsgroups and in other sources about real problems that arise when the rubber meets the road. Yes, people are complaining about Java compatibility problems, and I intend to get to the bottom of this. If you would find an article addressing this topic useful, I would appreciate a note telling me so (send me e-mail at michaele@microsoft.com).

About Portability

Providing for the creation of highly portable applications is a central element of the Java design philosophy. Most of my own coding experience has been in C, and when I started my career as a computer game programmer in the '80s, I lived and breathed the issues of portable coding. The game company I worked for supported several platforms over the years, including the Apple II, C64, Mac, Amiga, Atari ST, and PC—with game console versions always a looming possibility. So, "portability awareness" was my middle name. I learned an important lesson while I was porting games across these platforms: There is no such thing as completely portable code; there is only more portable code and less portable code. I also learned that adhering to a process usually enables the production of more portable code. In our case, the process included coding in C as much as possible and (not coincidentally) using what we called a virtual machine—a basic library of code we ported to different platforms and which (for us) served as a virtual graphics and user input library for running our game programs. So, from my own hard-won experiences creating portable software, I can attest that Java generally uses tried-and-true techniques to make more portable code easier to achieve.

In the interest of providing a balanced view, I should note that the virtual machine approach to treating all platforms as equal comes with certain trade-offs. To illustrate the point, let me tell a story about the last project I worked on before I quit making computer games for a living. In 1988, my game company was in the midst of making a tough business decision about whether to drop Amiga and Atari ST support for our PC games. I was young and naïve, and this infuriated me. We had created one of the first original hits on the Amiga. How could we even consider abandoning the best game computers ever created? So I worked night and day for three months porting our virtual machine to the Amiga and Atari ST platforms, convinced that I could reduce the problem of releasing these versions into a simple recompile. Naturally, I pulled off a sweet port of our virtual machine, and I even demonstrated the Amiga and Atari ST versions of one of our games after less than a week's effort. Can you guess what happened? Well, we abandoned the platform anyway. Nobody who owned an Amiga in 1988 would shell out good money for a straight port of a PC game. If they had wanted to own a PC, they would have bought a PC. No, they wanted a real game computer (hey, this was 1988 after all), so they bought an Amiga. I had learned an expensive lesson: A virtual machine forces you to release software that codes to the least common denominator of the architectures you intend to support. Faithfully supporting the features available on different architectures requires extra work and isn't always a viable option.

A quick clarification: In his review of this article, my fellow programmer/writer Paul Johns pointed out that I am mixing metaphors by implying that portable code is the same as cross-platform code. Portability implies ease of recompiling your software so you can run it on another platform, whereas (in the definition that Java has helped establish) cross-platform means compiling your code once and being able to run it anywhere (or, as the detractors say, compile it once, test it everywhere, and run it anywhere it worked). I think Paul is right—true cross-platform code isn't the same as portable code, but you could consider it the mother of all portable code.

So what makes Java so suitable for meeting the requirements of . . . let's call it cross-platform portability? Three main things: the architecture provides source-code–level portability; the application programming model is extensively and precisely specified (essentially as a part of the language itself); and certain elements of the language lend themselves to easier portability.

Virtual Machine

Over and over, I have read that Java is an interpreted programming language. However, this is not strictly correct; it's more accurate to say that Java is a language that's usually (not always) compiled to byte codes, and that the byte codes are sometimes (but not always) interpreted. Thus, Java source code is not translated directly into the low-level machine instructions that are native to a particular computer. But wait a minute: while I was tinkering with sample Java applets, I used a compiler to produce a Java class file from its source code. And when I tested the compiled applets with Microsoft Internet Explorer, I used Microsoft's Just-In-Time compiler (see "Description of the Just-In-Time Compiler" in the MSDN Library, Knowledge Base article Q154580) to speed up Java applet performance.

What exactly is being compiled here, and what is being interpreted? Well, the binary instructions (called byte codes) produced by the Java compiler comprise a brand-new set of low-level machine instructions that are native to a virtual computer, or a virtual machine. A virtual machine for interpreting compiled Java byte codes can be created as a software engine running on essentially any computer; it can even be created as a new hardware platform that uses Java byte codes for its own native machine instructions. In other words, the Java Virtual Machine can be implemented in hardware or in software, and the Java program won't "know" the difference. In fact, if the Sun Microsystems JavaStation (one idea for a so-called Network Computer) directly executes Java byte codes (I can't really tell), we might think of Java programs as being emulated on computers with software implementations of the Java Virtual Machine. The concept of emulating a program written for one computer architecture on a completely different computer is certainly not new. For example, there are folks who love their old 8-bit games so dearly that they have taken great pains to play them on their PCs. But Java's precisely defined language and virtual machine specifications were created with the intent of running Java programs on a variety of architectures. And that is at the core of what makes Java so interesting for people who care about cross-platform portability: it provides a precisely defined language and virtual machine specification for creating a virtual Java computer on any computer platform.
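To make the idea of "instructions native to a virtual computer" a little more concrete, here is a minimal sketch of a software interpreter for a made-up instruction set. Everything in it is an assumption for illustration: the class name ToyVm and the four opcodes are invented, and real Java byte codes and real virtual machines are far richer. The point is only that the same byte array produces the same result wherever the interpreter loop has been ported.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy stack machine; the opcodes are invented for illustration, not real JVM byte codes. */
final class ToyVm {
    static final byte PUSH = 0;  // the next byte is a literal operand to push
    static final byte ADD  = 1;  // pop two values, push their sum
    static final byte MUL  = 2;  // pop two values, push their product
    static final byte HALT = 3;  // stop and return the top of the stack

    /** Interprets the program one instruction at a time. */
    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction
        while (true) {
            byte op = code[pc++];
            switch (op) {
                case PUSH: stack.push((int) code[pc++]); break;
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                case HALT: return stack.pop();
                default:   throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // (2 + 3) * 4, "compiled" by hand into the toy instruction set.
        byte[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program)); // prints 20
    }
}
```

A hardware implementation would replace the switch statement with circuitry, and a Just-In-Time compiler would translate the byte array into native instructions before running it; in both cases the program itself stays the same.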

This sounds great in theory, but skeptical reviewers of this article expressed their concern about the huge challenges associated with offering identical behavior across different code bases and different platforms. Indeed, a number of relatively new Java implementations are out there, so we can start gathering information about how well this theory is turning into reality. If I had to decide whether to recommend a move to Java to take advantage of its cross-platform nature, I would study Java's actual compatibility benefits very carefully. (As Ronald Reagan said, "It's still trust, but verify. It's still play, but cut the cards. It's still watch closely. And don't be afraid to see what you see.") At the very least, I'd be testing the software everywhere I intended to run it, and I'd make a lot of noise about the problems I found. The squeaky wheel gets the oil; the others just break.

If you read the Knowledge Base article Q154580, "Description of the Just-In-Time Compiler" (MSDN Library, Knowledge Base), you can see that it is possible to change the manner in which Java programs are run by compiling the Java byte codes to native code just before they are executed by the virtual machine. So, instead of interpreting the program, you can execute it natively. This gives you source-level portability, while allowing you to retain the benefits of a program that is, in effect, compiled for your computer in the first place.

I can hear a violent argument brewing between the know-it-alls about portability: The C fanatics are loudly claiming that C is portable by virtue of recompiling, and the Java fanatics are proclaiming recompilation as a pathetic excuse for portability. I'm plugging my ears—it is almost irrelevant once you understand that the interpreted aspect of the Java language (which enables cross-platform support at the source-code level) is actually just part of the Java portability story.

Not Just Another API

An interpreted language, in and of itself, does not make its programs inherently portable across platforms because there are all kinds of elements in a program that are dependent on the underlying operating system. And the world hasn't agreed on a common method for obtaining these elements from the various operating systems in use today (unless you want to argue that Java does this). Hence, in addition to mastering their preferred programming language, software developers are stuck with learning various application programming interfaces, or APIs. So even if you know C like the back of your hand, developing C applications for multiple platforms means that you are going to become an expert in multiple APIs as well. In fact, differences between the APIs available on target platforms are critical in the design of a portable program. Sun addresses this nasty portability thorn by specifying a standard Java API for what they define as "Java-compatible" platforms. But you know what? The concept works equally well for any Java API that is truly supported across platforms. Merging the concept of a language with a particular API to talk to the operating system is like saying, "If you are going to write in C, you have to use the Win32® API." At the risk of being restrictive, combining the notions of the language and the API into a single package is a huge step toward the ideal of creating portable software. Voila! Now I understand what Java really means. It is not just a language, it is not just an API, it is both! But wait, if you order before midnight tonight you will also get this amazing set of knives . . .

Going to a single source for both the language and the API does have a potential downside: If you don't like the API, or the parts you need aren't released yet, or you find blocking problems with it, or . . . well, sorry, there's nothing you can do. However, the Java API can certainly be expanded, and indeed Sun's Java team is hard at work doing just that (see http://java.sun.com:81/products/api-overview/index.html ) to address problems such as extremely simplistic graphics and media support in their two shipping versions. Of course, expanding the Java API, when it isn't done through additional classes built on top of the existing Java virtual machine (VM), introduces new portability concerns. Adding the new stuff as an expansion of the standard Java VM introduces problems if the new VM is released before it is implemented across all the Java-compliant platforms. The one-time annoyance of downloading the new VM will also jab folks a bit, especially if their VM is an intrinsic part of the browser and they have to update that, too. On the other hand, optional extensions to the VM are also problematic if they aren't available on all the platforms you care about.

Since Java is interpreted, creating new class libraries in Java is another approach you can take to expanding the API in a cross-platform manner. That is, you can extend functionality in an entirely portable and compatible fashion by using the Java language itself. Of course, if you can't address critical problems or deficiencies in the underlying VM, you may be limited in what you can do. But your Java program would be portable. In fact, a number of companies are releasing Java class libraries with all kinds of cross-platform, expanded functionality—for example, Microsoft's new Application Foundation Classes (AFC). For more information about AFC, see http://www.microsoft.com/java/default.htm.

In versions 1.0.2 and 1.1 of the JDK, 22 Java user interface classes describe the set of objects that must be natively implemented using the operating system of the local machine. To accomplish this, Java associates a peer object with each of these special 22 classes. The peer object is responsible for drawing the object and delivering user-input events. These 22 classes are part of the Abstract Window Toolkit (AWT) and provide a well-defined interaction between the Java and native code components of the Java VM, as well as a platform-specific look and feel (see "Details of the Component Architecture" on Sun's Java Tutorial site at http://www.javasoft.com/nav/read/Tutorial/ui/components/peer.html). A peer object implements most of its functionality in native methods. If you traced your Java code in a debugger up to a native method call on a Windows®-based machine, and stepped into the native code, you would come out on the other side in a Windows dynamic-link library (DLL). If you are like me and always like to know how things work, you can check out Microsoft's Java SDK site (http://www.microsoft.com/java/sdk/default.htm) to find out how native methods function in Microsoft's Java VM (click "Raw Native Interface" in the left frame). I'm still examining how well this platform-specific look-and-feel stuff works (and how it is changing) because I think it is a critical chapter in the cross-platform story.

Laying out user interface components within a container (see the layout section of the Java Tutorial at http://www.javasoft.com/nav/read/Tutorial/ui/layout/index.html) also has potential portability problems that Java has sought to address through a layout manager architecture that allows plug-in custom managers. To be truly portable, an object that contains other objects should not have to worry about how the contained objects are physically positioned. Otherwise, some of the container's code will need to handle things like screen resolution, container size, viewing properties, and other irrelevant details. By associating a Java Container object with a LayoutManager object, you can solve this layout problem in one fell swoop. And you can create all sorts of views on a container through custom layout managers (again, see the Java Tutorial at http://www.javasoft.com/nav/read/Tutorial/ui/layout/custom.html). For example, being able to plug in your own layout manager allows AFC to replace Sun's implementations, and to do creative things like hosting ActiveMovie™ content in a ListView container. So if you care about viewing your Web page with different screen resolutions or hosting components in containers they've never heard of before, Java's Layout Manager interface can be a solution.
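Here is a rough sketch of what plugging in a custom layout manager looks like. The LayoutManager interface and its five methods are part of java.awt; the StackLayout and StackLayoutDemo classes are hypothetical names I made up for this example, and the manager deliberately ignores insets and minimum sizes to stay short. It is not how AFC or Sun's own managers are written.

```java
import java.awt.Component;
import java.awt.Container;
import java.awt.Dimension;
import java.awt.LayoutManager;

/** Hypothetical layout manager that stacks children top to bottom at full container width. */
final class StackLayout implements LayoutManager {
    public void addLayoutComponent(String name, Component comp) { /* no per-component state */ }
    public void removeLayoutComponent(Component comp) { /* no per-component state */ }

    public Dimension preferredLayoutSize(Container parent) {
        int width = 0, height = 0;
        for (Component c : parent.getComponents()) {
            Dimension d = c.getPreferredSize();
            width = Math.max(width, d.width);
            height += d.height;
        }
        return new Dimension(width, height);
    }

    public Dimension minimumLayoutSize(Container parent) {
        return preferredLayoutSize(parent); // simplification for this sketch
    }

    /** Positions every child; the container never computes coordinates itself. */
    public void layoutContainer(Container parent) {
        int y = 0;
        for (Component c : parent.getComponents()) {
            int h = c.getPreferredSize().height;
            c.setBounds(0, y, parent.getWidth(), h);
            y += h;
        }
    }
}

final class StackLayoutDemo {
    /** Lightweight placeholder component with a fixed preferred height. */
    static Component box(final int height) {
        return new Component() {
            @Override public Dimension getPreferredSize() { return new Dimension(10, height); }
        };
    }

    public static void main(String[] args) {
        Container panel = new Container();
        panel.setLayout(new StackLayout()); // the only place the policy is chosen
        panel.add(box(20));
        panel.add(box(30));
        panel.setSize(200, 100);
        panel.doLayout();
        // The second box sits directly below the first: y=20, height=30.
        System.out.println(panel.getComponent(1).getBounds());
    }
}
```

Notice that the code adding the boxes never mentions a coordinate. Swapping StackLayout for a different manager changes the arrangement without touching the container's own code, which is the portability benefit described above.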

A More Portable Language

Finally, elements of the Java language itself make it easier to create cross-platform applications. That does not mean that you cannot achieve decent portability through other means, for example, by using C with a good C reference library such as the GNU C Library. And, as I've already stated, creating portable software is mostly a matter of having good procedures in place. Nevertheless, the Java language does a number of things well from a portability point of view.

For example, as I mentioned earlier, the Java creators have tried very hard to make sure that no aspects of the language are left up to Java implementers to determine. Take, for instance, integer-variable memory size. In Java, a short is always 16 bits, an int is always 32 bits, and a long is always 64 bits. This strict definition of variable sizes is more limiting than C, where integer sizes can grow with the architecture, but it can also be more portable than an assurance that sizeof(short) <= sizeof(int) <= sizeof(long). Similarly, Java floating-point variable sizes are fixed. (And all floating-point operations conform to a single standard, IEEE 754. See the Principles of Computer Architecture, "Appendix A" at http://athos.rutgers.edu/~murdocca/POCA/POCA.html.) In fact, the Java language specification precisely defines the size of all the fundamental data types that ANSI C leaves implementation-dependent.
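A short sketch shows what fixed sizes buy you in practice. The class name FixedWidths and its helper method are invented for this example. One caveat: the SIZE constants on the wrapper classes were added in Java releases later than the JDK versions discussed in this article, so treat their use here as an assumption about your toolchain. The narrowing and overflow behavior, however, is defined by the language itself.

```java
final class FixedWidths {
    /** Narrowing to short keeps the low 16 bits, on every platform. */
    static short lowSixteenBits(int value) {
        return (short) value;
    }

    public static void main(String[] args) {
        // The language fixes these widths; they do not vary by platform.
        System.out.println(Short.SIZE);   // 16
        System.out.println(Integer.SIZE); // 32
        System.out.println(Long.SIZE);    // 64

        // Integer overflow wraps around the same way everywhere (two's complement).
        int wrapped = Integer.MAX_VALUE + 1;
        System.out.println(wrapped == Integer.MIN_VALUE); // true

        System.out.println(lowSixteenBits(40000)); // -25536
    }
}
```

In C, the equivalent of the last line could legitimately print different values depending on how wide the compiler made each type.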

Extending the idea of precisely defining the size of data types, the Java language has settled on using the Unicode version 2.0 character set exclusively. If you've ever had to drag some piece of code kicking and screaming between platforms that support different character sets, you can appreciate how irritating this can be. Of course, you can use the ANSI code page of the Unicode character set, but your characters will never be 8-bit, so you won't see any Java sources #ifdef'd to death for multiple character sets. For that matter, you won't see any #ifdefs at all, because Java doesn't have a preprocessor. Sun's Java team figured you don't need one since Java provides better ways to create an equivalent of the #define statement, and conditional compilation is generally used to handle different platform-dependent code paths anyway. So far I haven't found this to be a problem, but I guess I am not completely convinced yet.
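For C programmers wondering what replaces the preprocessor, here is a minimal sketch of the usual idioms. The names NoPreprocessor, MAX_PLAYERS, and DEBUG are hypothetical, and Character.SIZE comes from a later Java release than the ones covered here. A static final field stands in for #define, and an ordinary if on a constant stands in for #ifdef; a compiler may drop the dead branch, but that is an optimization and not something the program relies on.

```java
final class NoPreprocessor {
    // Plays the role of "#define MAX_PLAYERS 4" in C, but with a real type.
    static final int MAX_PLAYERS = 4;

    // Plays the role of "#ifdef DEBUG"; the branch below is ordinary, type-checked code.
    static final boolean DEBUG = false;

    /** Counts 16-bit code units; a Java char is 16 bits wide on every platform. */
    static int codeUnits(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        if (DEBUG) {
            System.out.println("debug build");
        }
        char omega = '\u03A9'; // Greek capital omega, written as a Unicode escape
        System.out.println((int) omega);             // 937
        System.out.println(Character.SIZE);          // 16
        System.out.println(codeUnits("Java\u03A9")); // 5
        System.out.println(MAX_PLAYERS);             // 4
    }
}
```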

Many of the things you often find in an API are instead included directly in the Java language, so many of the API elements that are nonportable in cross-platform code are inherently portable in Java. For example, synchronization is included. If you come from a Win32 background, you are familiar with synchronization functions for protecting named sections of your code. In Java, synchronization is specified on an object level. This implies that your application's design satisfies synchronization requirements in a naturally occurring set of reusable objects. Of course, there is no reason you couldn't build special Java classes to be used exactly like critical sections in "flat" Win32 code.
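As a small illustration of object-level synchronization, the sketch below guards a counter with the synchronized keyword. The Counter and CounterDemo classes are invented for this example. Each synchronized method takes the lock belonging to the object it is called on, so the lock travels with the data instead of living in a separate, named critical section.

```java
final class Counter {
    private int value;

    // "synchronized" acquires the lock that belongs to this Counter object.
    synchronized void increment() { value++; }
    synchronized int get() { return value; }
}

final class CounterDemo {
    /** Starts the given number of threads, each incrementing one shared counter n times. */
    static int hammer(int threads, final int n) {
        final Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < n; j++) counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for every worker to finish
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10000)); // 40000: no increments are lost
    }
}
```

Remove the synchronized keyword from increment() and the total can come up short, because value++ is a read, an add, and a write that two threads can interleave.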

Threading is another area that is often found in an API, but is instead built directly into the Java language. In Java, determining your threading model goes hand-in-hand with distilling your application into its self-contained tasks, or objects. Also, as with synchronization, where Java builds serialization into object-level locks, Java's methods for coordinating the activity of your threads occur on an object level. True to my heritage, I find myself comparing Java's functionality with Win32 functionality, and find Java lacking. For instance, using synchronization objects in wait functions (the only way to coordinate multiple threads of execution in Java) is just one of many ways you can control thread execution in Win32. Java provides only for single-object blocking and gives you no way to specify which thread to alert in multiple blocked threads, whereas Win32 offers multiple-object blocking as well as alertable wait functions (in the MSDN Library, search for "Wait Functions" in the Platform SDK). If you have worked around Java's limitation, good for you, but you'd better make sure your program doesn't break with future releases of Java.
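To show what single-object blocking looks like, here is a minimal sketch of two threads coordinating through wait() and notifyAll(), both of which are methods on java.lang.Object. The Mailbox and MailboxDemo classes are hypothetical. Note how the limitation described above shows up in the code: a thread waits on exactly one object, and notifyAll() wakes every waiter without letting you choose among them, which is why each waiter rechecks its condition in a loop.

```java
/** One-slot mailbox: put() blocks while the slot is full, take() blocks while it is empty. */
final class Mailbox {
    private String message; // null means the slot is empty

    synchronized void put(String m) throws InterruptedException {
        while (message != null) wait(); // releases this object's lock while blocked
        message = m;
        notifyAll(); // wake all waiters; each rechecks its own condition
    }

    synchronized String take() throws InterruptedException {
        while (message == null) wait();
        String m = message;
        message = null;
        notifyAll();
        return m;
    }
}

final class MailboxDemo {
    /** Sends text through a Mailbox from a second thread and returns what arrives. */
    static String roundTrip(final String text) {
        final Mailbox box = new Mailbox();
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    box.put(text);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();
        try {
            String received = box.take(); // blocks until the producer has delivered
            producer.join();
            return received;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // prints hello
    }
}
```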

One of the biggest usability problems with today's software is not even related to poor user interface design. Nope, it is far simpler than that—the problem is buggy software. Java takes a pragmatic approach to those ubiquitous, unexpected program errors; it forces developers to consider how their program might fail by making checked exceptions an integral part of the language and by throwing appropriate exceptions in the JDK class implementations. Checked exceptions allow the compiler to verify, at compile time, whether a piece of code is handling problems that can occur while executing a given function. In other words, unless you do your error handling correctly, you won't be allowed to compile, let alone execute. Gee, what a concept! When was the last time you groaned and whacked a nearby object after an inexplicable software failure? In my experience, creating robust software requires a professional and concerted effort that begins well before a single line of code has been written, and I can appreciate the extra level of insurance Java provides to make sure this effort will be made. Of course, some C++ compilers (including Microsoft's) include checked exceptions, but this feature is merely a nonstandard language element that can be incorporated into an API if desired. Compile-time checking of errors provides another example of how merging the concept of a language with an API has certain advantages. (On the Web, I came across the paper "Experiences Converting a C++ Communication Software Framework to Java," at http://siesta.cs.wustl.edu/~pjain/java/java_notes.html, that you might want to check out if you find this topic particularly interesting.)
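Here is a minimal sketch of a checked exception at work. The CheckedDemo class and its two methods are invented for the example; BufferedReader, StringReader, and IOException are standard java.io classes. The thing to try is deleting the try/catch in firstLineOrDefault: the compiler then rejects the class, because IOException is neither caught nor declared.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

final class CheckedDemo {
    /** The "throws" clause is part of the method's contract and is enforced by the compiler. */
    static String firstLine(String text) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        try {
            return reader.readLine(); // readLine() is declared to throw IOException
        } finally {
            reader.close();
        }
    }

    /** A caller must either catch IOException or declare it; doing neither is a compile error. */
    static String firstLineOrDefault(String text, String fallback) {
        try {
            String line = firstLine(text);
            return line != null ? line : fallback;
        } catch (IOException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLineOrDefault("alpha\nbeta", "none")); // prints alpha
        System.out.println(firstLineOrDefault("", "none"));            // prints none
    }
}
```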

Should You Really Care About Portable Software?

The applet I created while teaching myself Java is the only thing I have ever written that ran on multiple platforms without requiring anything "extra" (except for testing, of course, and I did find a couple of minor problems of my own making that I had to fix). On the other hand, I have talked to a number of people who are not at all convinced that the vast majority of software developers care about writing code that runs everywhere. After all, Windows and MS-DOS® remain the dominant platform of our age. But that platform-centric view ignores the fact that content is king on the Web, and that you want everyone to be able to access your content in the present and in the future.

Even if you don't care about portability, there are others who do. For example, in the enterprise environment, automatic compatibility between all mainframes, servers, desktops, portables, and handhelds could mean an enormous reduction in on-site programming and tech support. And, of course, Web publishing reaches platforms as diverse as Unix workstations and Windows CE Handheld PCs.

I've talked to people who think that Java is over-hyped and will soon become another failed and forgotten technology. Others believe that such bleak assessments of Java's future condemn people to live in the past. The truth, I suspect, lies somewhere in the middle: Java is a tool, and like any other tool, it is valuable only when used correctly.

Did you find this article useful? Any suggestions? Gripes? Compliments? Drop me a line at michaele@microsoft.com and let me know.