Java Compilation and its Interpreter
Platform independence is one of the key benefits of Java. To understand it, we first need to understand what platform dependency is, and for that we need to know what compilation is. In this post, I will walk through how Java programs run, starting from a high-level view of how computers read code. I will also talk about the interpreter in Java and explain what the JVM, Java SE, JDK, and JRE are. As a new learner of Java, it is important to get these concepts clear from the start on the road to becoming a good Java engineer.
How does the computer read instructions?
Every CPU (processor) understands a fixed set of instructions. A program is made up of these instructions, and each instruction tells the CPU to perform a certain action. By arranging instructions in sequence, we can write code that performs meaningful tasks.
Each instruction is basically a sequence of 0s and 1s, known as binary format. The code you write is called the source code, while machine language, also called machine code or native code, is what a computer understands. In addition, a computer's main memory is built from transistors that switch between high and low voltage levels.
Since there are only two options, each high or low level can be read as a 1 or a 0. The processor (the CPU) reads these states and uses them to control other devices or to carry out instructions. Each set of high and low voltage readings forms a binary sequence that acts as an instruction.
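To make the idea of "sequences of 0s and 1s" concrete, here is a minimal sketch that prints the bit pattern of a small number and of a character. The class name BinaryDemo and the helper toBits are made up for this illustration.

```java
class BinaryDemo {
    // Formats the low 8 bits of a value as a fixed-width string of 0s and 1s.
    // (toBits is a hypothetical helper written for this post.)
    static String toBits(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        // Left-pad with zeros so every result is exactly 8 characters wide.
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(toBits(5));    // 00000101
        System.out.println(toBits('A'));  // 01000001 (the character 'A' is stored as the number 65)
        // Reading a bit string back into a number:
        System.out.println(Integer.parseInt("00000101", 2)); // 5
    }
}
```

The same principle applies to instructions: a real CPU assigns its own bit patterns to its operations, and those patterns are what machine code consists of.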
The CPU executes machine language using the fetch-and-execute cycle. When a program is executed, its instructions are loaded into memory. The CPU then fetches the instructions and data from memory, executes them, and writes the results back to memory. This cycle continues until all instructions in the program have been executed.
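The cycle can be sketched as a small simulation. This is only a toy model: the opcodes LOAD, ADD, STORE and HALT are invented for this example and do not correspond to any real CPU's instruction set.

```java
class FetchExecuteDemo {
    // Hypothetical opcodes for a toy machine; real CPUs define their own instruction sets.
    static final int HALT = 0;   // HALT    : stop the program
    static final int LOAD = 1;   // LOAD n  : put n into the accumulator
    static final int ADD = 2;    // ADD n   : add n to the accumulator
    static final int STORE = 3;  // STORE a : write the accumulator to memory[a]

    // Runs a program given as (opcode, operand) pairs and returns the final memory contents.
    static int[] run(int[] program, int memorySize) {
        int[] memory = new int[memorySize];
        int pc = 0;   // program counter: index of the next instruction
        int acc = 0;  // accumulator: the toy CPU's single register
        while (true) {
            // Fetch the next instruction and its operand.
            int opcode = program[pc];
            int operand = program[pc + 1];
            pc += 2;
            // Decode and execute it.
            switch (opcode) {
                case LOAD:  acc = operand; break;
                case ADD:   acc += operand; break;
                case STORE: memory[operand] = acc; break; // write the result back to memory
                case HALT:  return memory;
                default: throw new IllegalStateException("Unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] program = { LOAD, 2, ADD, 3, STORE, 0, HALT, 0 };
        int[] memory = run(program, 4);
        System.out.println(memory[0]); // 5
    }
}
```

The loop body is the whole cycle: fetch, decode, execute, write back, repeat until the program says to stop.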
It is very troublesome for humans to write instructions as 0s and 1s. Hence, engineers developed programming languages that make writing instructions easier.
A compiler takes your source code and turns it into machine code. It also verifies the syntax and semantics of the source code and performs code optimization.
However, if you compile your source code on Windows and try to run the result on Mac or Linux, it might not work. They are different operating systems with different executable formats and system calls, and they may run on processors with different machine code instructions. This is called platform dependency.
Let’s say we compile a C program with the GCC compiler. The resulting file is typically an .exe file on Windows and an .out file (a.out by default) on Linux. However, an .exe file generated on a Windows machine cannot run directly on Linux. This is because the executable file formats on Windows (PE, Portable Executable) and Linux (ELF, Executable and Linkable Format) are different, as are the system calls used to perform operations such as opening files.
In Java, compiled programs do not have this platform dependency.
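As a small illustration, the sketch below reports which platform it is running on. The class name PlatformDemo is made up for this post; the system properties os.name and os.arch are standard. The point is that the same compiled class file can be copied to Windows, Mac, or Linux and should print a different answer on each, without being recompiled.

```java
import java.io.File;

class PlatformDemo {
    // Builds a short description of the platform the JVM is currently running on.
    static String describePlatform() {
        return System.getProperty("os.name") + " / " + System.getProperty("os.arch")
                + " (file separator '" + File.separator + "')";
    }

    public static void main(String[] args) {
        // The output depends on the machine, so no expected value is shown here.
        System.out.println(describePlatform());
    }
}
```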
An interpreter is different from a compiler: it directly executes the source code and produces the result, without generating a separate machine code file first.
You can see the similarity between a CPU and an interpreter.
Like the CPU in the fetch-and-execute cycle, an interpreter reads one instruction at a time and carries it out, but it does so using precompiled machine code in its own library. Java uses compilation to achieve fast execution speed and an interpreter to achieve platform independence.
Java source code (a .java file) is passed to the Java compiler to generate Java bytecode (a .class file, which is an instruction set in the sense we mentioned before, but for the JVM rather than a physical CPU). Other JVM-based languages such as Kotlin or Scala have their own compilers that also generate bytecode. Java bytecode is not machine code but an intermediate format, which can run on any operating system that has a Java interpreter (the JVM, Java Virtual Machine) installed. Because bytecode is more compact and easier to process than Java source code, it can be run faster than interpreting the source directly. In addition, the JVM also performs Just-In-Time (JIT) compilation to optimize execution further.
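The source-to-bytecode step can be tried from inside a Java program. The sketch below uses the standard javax.tools API to compile a tiny source file into a temporary directory and then looks at the first four bytes of the resulting .class file. It assumes you are running on a JDK (a bare runtime has no compiler); the names BytecodeDemo, compileHello and magic are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class BytecodeDemo {
    // Writes a small Hello.java to a temp directory, compiles it, and returns the bytes of Hello.class.
    static byte[] compileHello() {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No Java compiler available; run this on a JDK");
        }
        try {
            Path dir = Files.createTempDirectory("bytecode-demo");
            Path source = dir.resolve("Hello.java");
            String code = "public class Hello { public static void main(String[] a) {"
                    + " System.out.println(\"Hello\"); } }";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));
            // Equivalent to running: javac -d <dir> <dir>/Hello.java
            int status = compiler.run(null, null, null, "-d", dir.toString(), source.toString());
            if (status != 0) {
                throw new IllegalStateException("Compilation failed with status " + status);
            }
            return Files.readAllBytes(dir.resolve("Hello.class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the first four bytes of a class file as an uppercase hex string.
    static String magic(byte[] classFile) {
        return String.format("%02X%02X%02X%02X",
                classFile[0] & 0xFF, classFile[1] & 0xFF, classFile[2] & 0xFF, classFile[3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] bytecode = compileHello();
        System.out.println("Hello.class is " + bytecode.length + " bytes");
        System.out.println("Magic number: " + magic(bytecode)); // CAFEBABE
    }
}
```

Every valid class file starts with the magic number CAFEBABE, whichever operating system produced it. On the command line, the same step is done with javac Hello.java, and javap -c Hello prints the bytecode instructions in readable form.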
JVM - Java Virtual Machine
The JVM is also called an abstract computing machine (this is why it is called virtual). It interprets Java bytecode and manages memory at runtime with automatic memory management.
When a Java program is executed, an instance of the JVM is created in memory. It then loads the Java bytecode generated by the Java compiler and executes it. The ClassLoader in the JVM loads the bytecode, and the JVM allocates memory from the underlying operating system for its memory areas (also called the runtime data areas), which include the heap. The bytecode is then passed to a verifier, which checks that the loaded class file is not corrupted. Once there are no issues, it is passed to the execution engine, which includes the JIT compiler and the interpreter. Lastly, there is a garbage collector, which reclaims memory occupied by objects that are no longer in use by the program.
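Some of these parts can be observed from a running program. The sketch below asks which class loader loaded a class and how much heap is in use. The class name JvmRuntimeDemo and its helpers are made up for this example, and the numbers will differ from machine to machine and from run to run.

```java
class JvmRuntimeDemo {
    // Names the class loader that loaded a class. Core classes such as String are loaded by
    // the bootstrap loader, which the API represents as null.
    static String loaderName(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    // Approximate number of heap bytes currently in use by this JVM instance.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("String loaded by:    " + loaderName(String.class));
        System.out.println("This class loaded by: " + loaderName(JvmRuntimeDemo.class));

        System.out.println("Used heap at start:  " + usedHeapBytes());
        byte[][] blocks = new byte[50][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1_000_000]; // allocate roughly 50 MB on the heap in total
        }
        System.out.println("Used heap after allocation: " + usedHeapBytes());

        blocks = null;  // the arrays are no longer reachable, so they become garbage
        System.gc();    // only a request; the JVM decides whether and when to collect
        System.out.println("Used heap after gc hint:    " + usedHeapBytes());
    }
}
```

Typically the used heap grows after the allocation and shrinks again after the collection hint, but the JVM is free to behave differently, so treat the printed values as an observation rather than a guarantee.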
JIT Compilation - Dynamic Compilation
JIT compilation in the execution engine inside the JVM identifies which parts of the Java bytecode are executed frequently. It marks them as hot spots, compiles them to machine code, and saves that machine code in a code cache kept in memory for future use. Whenever the corresponding code is executed again, the cached machine code is used for faster execution.
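One rough way to see this effect is to time the same piece of work several times in one run. This is only an informal sketch, not a proper benchmark: the class name JitDemo and the workload are made up for this example, and the timings depend on your machine and JVM.

```java
class JitDemo {
    // A small method that is called often enough to become a candidate hot spot.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long checksum = 0; // accumulated and printed so the work cannot simply be optimized away
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            for (int i = 0; i < 20_000; i++) {
                checksum += sumOfSquares(1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("round " + round + ": " + micros + " microseconds");
        }
        System.out.println("checksum = " + checksum);
    }
}
```

On a HotSpot JVM the later rounds are often faster than the first one, because by then the method has been compiled to machine code. Running the same program with java -Xint (interpreter only) usually removes that speed-up, and -XX:+PrintCompilation lists the methods as they get compiled.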
JDK - Java Development Kit
The JDK is an implementation of Java SE that is used to develop and run Java programs. Java SE (formerly J2SE) refers to the Java Standard Edition, which defines the specification of Java, including the Java Language Specification (JLS, which defines the Java language itself, e.g. its syntax), the JVM Specification (which defines bytecode) and the Java API Specification (the libraries).
We can now understand the JDK as the complete environment for developing and running Java programs. It includes the development tools (such as the Java compiler) and the JRE (Java Runtime Environment, which includes the JVM and the Java API).
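You can ask a running program what it is running on. The sketch below prints a few standard system properties and checks whether a Java compiler is available, which is a reasonable hint (not a strict proof) that you are on a JDK rather than a runtime-only installation. The class name JdkCheckDemo and its helper methods are made up for this example.

```java
import javax.tools.ToolProvider;

class JdkCheckDemo {
    // True when the development tools (at least the compiler) are available to this JVM.
    static boolean hasCompiler() {
        return ToolProvider.getSystemJavaCompiler() != null;
    }

    // Collects a few standard system properties that describe the runtime.
    static String runtimeSummary() {
        return "Java version:      " + System.getProperty("java.version") + "\n"
             + "Java SE spec:      " + System.getProperty("java.specification.version") + "\n"
             + "JVM:               " + System.getProperty("java.vm.name") + "\n"
             + "Installed at:      " + System.getProperty("java.home") + "\n"
             + "Compiler present:  " + hasCompiler();
    }

    public static void main(String[] args) {
        // The values depend on your installation, so no expected output is shown.
        System.out.println(runtimeSummary());
    }
}
```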
This post is a good starting point before you get your hands dirty with Java or other programming languages. I have covered a few important concepts, including how computers execute what you write and how a Java program is executed behind the scenes. I have also talked about the JDK, JVM, JRE, and Java SE. Understanding these concepts will help you write and learn Java code in the future.