Aug 25, 2011

Java Buzz words

          Java: A simple, object-oriented, network-savvy, interpreted, robust, secure, architecture neutral, portable, high-performance, multithreaded, dynamic language.One way to characterize a system is with a set of buzzwords. We use a standard set of them in describing Java.

      James Goslings team wanted to build a system that could be programmed easily without a lot of esoteric training and which leveraged today's standard practice. Most programmers working those days use C, and most programmers doing object-oriented programming use C++. So even though they found that C++ was unsuitable, they designed Java as closely to C++ as possible in order to make the system more comprehensible.
Java omits many rarely used, poorly understood, confusing features of C++ that in their experience bring more grief than benefit. These omitted features primarily consist of operator overloading (although the Java language does have method overloading), multiple inheritance, and extensive automatic coercions.
     They added automatic garbage collection, thereby simplifying the task of Java programming but making the system somewhat more complicated. A common source of complexity in many C and C++ applications is storage management: the allocation and freeing of memory. By virtue of having automatic garbage collection (periodic freeing of memory not being referenced) the Java language not only makes the programming task easier, it also dramatically cuts down on bugs.
        Another aspect of being simple is being small. One of the goals of Java is to enable the construction of software that can run stand-alone in small machines. The Java interpreter and standard libraries have a small footprint. A small size is important for use in embedded systems and so Java can be easily downloaded over the net.

            This is, unfortunately, one of the most overused buzzwords in the industry. But object-oriented design is very powerful because it facilitates the clean definition of interfaces and makes it possible to provide reusable "software ICs."
Simply stated, object-oriented design is a technique that focuses design on the data (=objects) and on the interfaces to it. To make an analogy with carpentry, an "object-oriented" carpenter would be mostly concerned with the chair he was building, and secondarily with the tools used to make it; a "non-object-oriented" carpenter would think primarily of his tools. Object-oriented design is also the mechanism for defining how modules "plug and play."
The object-oriented facilities of Java are essentially those of C++, with extensions from Objective C for more dynamic method resolution.

             Java has an extensive library of routines for coping easily with TCP/IP protocols like HTTP and FTP. This makes creating network connections much easier than in C or C++. Java applications can open and access objects across the net via URLs with the same ease that programmers are used to when accessing a local file system.

         Java is intended for writing programs that must be reliable in a variety of ways. Java puts a lot of emphasis on early checking for possible problems, later dynamic (runtime) checking, and eliminating situations that are error prone.
One of the advantages of a strongly typed language (like C++) is that it allows extensive compile-time checking so bugs can be found early. Unfortunately, C++ inherits a number of loopholes in compile-time checking from C, which is relatively lax (particularly method/procedure declarations). In Java, we require declarations and do not support C-style implicit declarations.
The linker understands the type system and repeats many of the type checks done by the compiler to guard against version mismatch problems.
The single biggest difference between Java and C/C++ is that Java has a pointer model that eliminates the possibility of overwriting memory and corrupting data. Instead of pointer arithmetic, Java has true arrays. This allows subscript checking to be performed. In addition, it is not possible to turn an arbitrary integer into a pointer by casting.
While Java doesn't make the QA problem go away, it does make it significantly easier.
Very dynamic languages like Lisp, TCL and Smalltalk are often used for prototyping. One of the reasons for their success at this is that they are very robust: you don't have to worry about freeing or corrupting memory. Java programmers can be relatively fearless about dealing with memory because they don't have to worry about it getting corrupted. Because there are no pointers in Java, programs can't accidentally overwrite the end of a memory buffer. Java programs also cannot gain unauthorized access to memory, which could happen in C or C++.
One reason that dynamic languages are good for prototyping is that they don't require you to pin down decisions too early. Java uses another approach to solve this dilemma; Java forces you to make choices explicitly because it has static typing, which the compiler enforces. Along with these choices comes a lot of assistance: you can write method invocations and if you get something wrong, you are informed about it at compile time. You don't have to worry about method invocation error.
           Java is intended for use in networked/distributed environments. Toward that end, a lot of emphasis has been placed on security. Java enables the construction of virus-free, tamper-free systems. The authentication techniques are based on public-key encryption.
There is a strong interplay between "robust" and "secure." For example, the changes to the semantics of pointers make it impossible for applications to forge access to data structures or to access private data in objects that they do not have access to. This closes the door on most activities of viruses.
Architecture Neutral
              Java was designed to support applications on networks. In general, networks are composed of a variety of systems with a variety of CPU and operating system architectures. To enable a Java application to execute anywhere on the network, the compiler generates an architecture-neutral object file format--the compiled code is executable on many processors, given the presence of the Java runtime system.
This is useful not only for networks but also for single system software distribution. In the present personal computer market, application writers have to produce versions of their application that are compatible with the IBM PC and with the Apple Macintosh. With the PC market (through Windows/NT) diversifying into many CPU architectures, and Apple moving off the 680x0 toward the PowerPC, production of software that runs on all platforms becomes nearly impossible. With Java, the same version of the application runs on all platforms.
The Java compiler does this by generating byte code instructions which have nothing to do with a particular computer architecture. Rather, they are designed to be both easy to interpret on any machine and easily translated into native machine code on the fly.
            Being architecture neutral is a big chunk of being portable, but there's more to it than that. Unlike C and C++, there are no "implementation dependent" aspects of the specification. The sizes of the primitive data types are specified, as is the behaviour of arithmetic on them. For example, "int" always means a signed two's complement 32 bit integer, and "float" always means a 32-bit IEEE 754 floating point number. Making these choices is feasible in this day and age because essentially all interesting CPUs share these characteristics.
The libraries that are a part of the system define portable interfaces. For example, there is an abstract Window class and implementations of it for Unix, Windows NT/95, and the Macintosh.
The Java system itself is quite portable. The compiler is written in Java and the runtime is written in ANSI C with a clean portability boundary. The portability boundary is essentially a POSIX subset.
         Java byte codes are translated on the fly to native machine instructions (interpreted) and not stored anywhere. And since linking is a more incremental and lightweight process, the development process can be much more rapid and exploratory.
As a part of the byte code stream, more compile-time information is carried over and available at runtime. This is what the linker's type checks are based on. It also makes programs more amenable to debugging.
High Performance
              While the performance of interpreted byte codes is usually more than adequate, there are situations where higher performance is required. The byte codes can be translated on the fly (at runtime) into machine code for the particular CPU the application is running on. For those accustomed to the normal design of a compiler and dynamic loader, this is somewhat like putting the final machine code generator in the dynamic loader.
The byte code format was designed with generating machine codes in mind, so the actual process of generating machine code is generally simple. Efficient code is produced: the compiler does automatic register allocation and some optimization when it produces the byte codes.
In interpreted code we're getting about 300,000 method calls per second on an Sun Microsystems SPARCstation 10. The performance of  byte codes converted to machine code is almost indistinguishable from native C or C++.
           There are many things going on at the same time in the world around us. Multithreading is a way of building applications with multiple threads Unfortunately, writing programs that deal with many things happening at once can be much more difficult than writing in the conventional single-threaded C and C++ style.
Java has a sophisticated set of synchronization primitives that are based on the widely used monitor and condition variable paradigm introduced by C.A.R.Hoare. By integrating these concepts into the language (rather than only in classes) they become much easier to use and are more robust. Much of the style of this integration came from Xerox's Cedar/Mesa system.
Other benefits of multithreading are better interactive responsiveness and real-time behavior. This is limited, however, by the underlying platform: stand-alone Java runtime environments have good real-time behavior. Running on top of other systems like Unix, Windows, the Macintosh, or Windows NT limits the real-time responsiveness to that of the underlying system.
             In a number of ways, Java is a more dynamic language than C or C++. It was designed to adapt to an evolving environment.
For example, one major problem with C++ in a production environment is a side-effect of the way that code is implemented. If company A produces a class library (a library of plug and play components) and company B buys it and uses it in their product, then if A changes its library and distributes a new release, B will almost certainly have to recompile and redistribute their own software. In an environment where the end user gets A and B's software independently (say A is an OS vendor and B is an application vendor) problems can result.
For example, if A distributes an upgrade to its libraries, then all of the software from B will break. It is possible to avoid this problem in C++, but it is extraordinarily difficult and it effectively means not using any of the language's OO features directly.
By making these interconnections between modules later, Java completely avoids these problems and makes the use of the object-oriented paradigm much more straightforward. Libraries can freely add new methods and instance variables without any effect on their clients.
An interface specifies a set of methods that an object can perform but leaves open how the object should implement those methods. A class implements an interface by implementing all the methods contained in the interface. In contrast, inheritance by subclassing passes both a set of methods and their implementations from superclass to subclass. A Java class can implement multiple interfaces but can only inherit from a single superclass. Interfaces promote flexibility and reusability in code by connecting objects in terms of what they can do rather than how they do it.
Classes have a runtime representation: there is a class named Class, instances of which contain runtime class definitions. If, in a C or C++ program, you have a pointer to an object but you don't know what type of object it is, there is no way to find out. However, in Java, finding out based on the runtime type information is straightforward. Because casts are checked at both compile-time and runtime, you can trust a cast in Java. On the other hand, in C and C++, the compiler just trusts that you're doing the right thing.
It is also possible to look up the definition of a class given a string containing its name. This means that you can compute a data type name and have it easily dynamically-linked into the running system.


Post a Comment