Showing posts with label Java. Show all posts

Tuesday, March 09, 2021

Understanding the output generated by PrintCompilation flag

The last example from our previous post produced the following output

> java -XX:+PrintCompilation Main2 | Select-String -Pattern calculate
     76   68       3       Main::calculate (9 bytes)
     79   72       4       Main::calculate (9 bytes)
     81   68       3       Main::calculate (9 bytes)   made not entrant

>

Let's try to understand what this output means

The first column is the number of milliseconds elapsed since the start of the program. This indicates the time at which our method calculate() was JIT compiled

The second column here is the compilation id. Each compilation unit gets a unique id. 68 on the 1st and 3rd lines in the above output indicates that they refer to the same compilation unit.

The third column is blank in our output. It's a five character string, representing the characteristics of the code compiled

% - OSR compilation.
s - synchronized method.
! - Method has an exception handler.
b - Blocking mode.
n - Wrapper to a native method.

Fourth column is a number from 0 to 4 indicating the tier at which the compilation is done. If tiered compilation is turned off, this column will be blank.

Fifth column is the fully qualified method name

Sixth column is the size in bytes - size of the byte code that is getting compiled 

The last column, when present, contains a message about what happened to the compiled code, such as a deoptimization - made not entrant in our sample output
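To make the column layout concrete, here is a minimal sketch that splits one line of this output into the fields described above. The class name PrintCompilationLine and its field names are made up for illustration, and the regular expression only assumes the line shape seen in our sample output; lines that carry extra tokens are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK: parses one PrintCompilation line
// of the shape shown in the sample output above.
class PrintCompilationLine {

    // Columns: elapsed ms, compilation id, optional flags, optional tier,
    // method name, bytecode size, optional trailing message.
    private static final Pattern LINE = Pattern.compile(
            "^\\s*(\\d+)\\s+(\\d+)\\s+((?:[%s!bn]\\s*)*)(\\d?)\\s+"
            + "(\\S+::\\S+)\\s+\\((\\d+) bytes\\)\\s*(.*)$");

    final long elapsedMillis;
    final int compileId;
    final String flags;   // any of % s ! b n, empty when the column is blank
    final int tier;       // -1 when the tier column is blank
    final String method;
    final int bytecodeSize;
    final String message; // empty when there is no trailing message

    private PrintCompilationLine(Matcher m) {
        elapsedMillis = Long.parseLong(m.group(1));
        compileId = Integer.parseInt(m.group(2));
        flags = m.group(3).replaceAll("\\s", "");
        tier = m.group(4).isEmpty() ? -1 : Integer.parseInt(m.group(4));
        method = m.group(5);
        bytecodeSize = Integer.parseInt(m.group(6));
        message = m.group(7).trim();
    }

    static PrintCompilationLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized line: " + line);
        }
        return new PrintCompilationLine(m);
    }

    public static void main(String[] args) {
        PrintCompilationLine entry = parse(
                "     81   68       3       Main::calculate (9 bytes)   made not entrant");
        System.out.println("elapsed ms : " + entry.elapsedMillis);
        System.out.println("compile id : " + entry.compileId);
        System.out.println("tier       : " + entry.tier);
        System.out.println("method     : " + entry.method);
        System.out.println("size       : " + entry.bytecodeSize);
        System.out.println("message    : " + entry.message);
    }
}
```

Grouping parsed lines by compilation id is then enough to see that the 1st and 3rd lines of the sample describe the same compilation unit.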


 


Watching JIT in Action

How can we find what JIT is doing to our code at runtime? And how can we figure out which of our methods are getting compiled at runtime and when?

We have a java command line flag -XX:+PrintCompilation which when included, logs all the JIT compile events to standard output.

Let's see this in action. We will start with the below code

public class Main {
    
    static final int LOOP_COUNT = 10 * 10; //100

    public static void main(String[] args) {

        for (int i = 0; i < LOOP_COUNT; i++) {
            calculate();
        }
    }
    
    static void calculate() {
        double value = Math.random() * Math.random();
    }
}

We have the calculate() method, which creates two random numbers and multiplies them. This method is called in a loop from the main() method. We start with a loop count of 100. 

Execute this program with the PrintCompilation flag, and watch for JIT compilation of the calculate() method in the output, using the below command

> java -XX:+PrintCompilation Main | Select-String -Pattern calculate

>

Note: The above command is run on Windows, using Select-String, the PowerShell equivalent of grep

Without grep, we can see a lot of lines in the output - the logs from JIT compilation of java library methods.

Here we grep for "calculate" in the generated output. We do not see any compile event log for our method, indicating that our method calculate() is not JIT compiled this time.  

We will now increase the loop count to 10K and watch out for JIT compilation event for our calculate() method. The code now is as shown below

public class Main1 {
    
    static final int LOOP_COUNT = 100 * 100; //10K

    public static void main(String[] args) {

        for (int i = 0; i < LOOP_COUNT; i++) {
            calculate();
        }
    }
    
    static void calculate() {
        double value = Math.random() * Math.random();
    }
}

Executing this code with the PrintCompilation flag, and watching for JIT compilation event for calculate() method, we see the compilation event log in the output as shown below 

> java -XX:+PrintCompilation Main1 | Select-String -Pattern calculate

     71   67       3       Main1::calculate (9 bytes)
     
>

This indicates that our method calculate() is JIT compiled this time. 

What happens if we increase the loop count still further? We will try to increase the loop count to 1M this time. Code now is as shown below

public class Main2 {
    
    static final int LOOP_COUNT = 1000 * 1000; //1M

    public static void main(String[] args) {

        for (int i = 0; i < LOOP_COUNT; i++) {
            calculate();
        }
    }

    static void calculate() {
        double value = Math.random() * Math.random();
    }
}
 

Executing this code again with the PrintCompilation flag, we now see multiple JIT compilation event logs for our calculate() method

> java -XX:+PrintCompilation Main2 | Select-String -Pattern calculate
     76   68       3       Main::calculate (9 bytes)
     79   72       4       Main::calculate (9 bytes)
     81   68       3       Main::calculate (9 bytes)   made not entrant

>

Why does JIT compilation kick in only when the loop count is high, and why are we seeing multiple JIT compilation events when the loop count is very high? We will explore that in a subsequent post.

Before that we will see how to read the output generated by PrintCompilation flag in our next post.


JIT Compiler

Let's start by talking a bit about the JIT compiler...

This is an age old topic that has been widely discussed. There is a lot of material out there explaining JIT compilation in much greater detail.

But I am bent on telling it one more time... the way I have understood it... put in as simply as I can!!!

So let's start with this universal Hello World! code

public class Main {

    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}

To execute this program, we execute two commands

  1. javac command to compile this source code into a class file
  2. java command to execute the class file

javac Main.java

java Main

The first command converts the source code into a set of JVM instructions, commonly referred to as Java byte codes. These byte codes are stored in a .class file.

The second command starts a JVM instance, reads the byte codes from the class file and executes them on the JVM to produce the desired output.  

Now the JVM itself is a virtual layer on top of the hardware on which we execute the java command. What the JVM actually does is interpret the byte codes and convert them (compile them) into the assembly language instruction set that can be executed on the specific hardware.

So the JVM, at a high level performs the following steps for each byte code instruction

  1. Read that byte code instruction
  2. Interpret that byte code and compile it to generate the equivalent assembly language instructions for the specific hardware on which it is getting executed
  3. And finally get these generated assembly language instructions executed on that hardware

This might just be fine for a simple program like hello world, but the real world programs are much more complex. 

Consider for instance, the below example where we print the String "Hello World" from within the method hello(). This method is called 10,000 times over from the main method

public class Main1 {

    public static void main(String[] args) {

        for (int i = 0; i < 10000; i++) {
            hello();
        }
    }
    
    static void hello() {
        System.out.println("Hello World!");
    }
}

The JVM performing the cycle of read -> interpret -> execute 10,000 times would surely not be an efficient approach.

The compiled instruction set that gets generated is going to be the same for each of the 10,000 cycles, so the Java byte code need not be interpreted each time the JVM loops through. It can be compiled once and the result reused.

The component that does this compilation is the Just-In-Time compiler, or JIT compiler, and it runs as part of the JVM process.

So, 

  • javac is the static compiler that converts java source files into byte code instruction set
  • JIT compiler is part of the JVM process that is started by the java command and it performs dynamic compilation of byte code instruction set to native assembly language instructions
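The idea of translating once and reusing the result can be shown with a toy model. This is only a sketch of the caching idea and not how the JVM is implemented: the class TranslationCacheSketch, its translate() step and the translation counter are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntSupplier;

// Toy model only: illustrates "translate once, reuse many times".
// Nothing here reflects the real internals of a JVM.
class TranslationCacheSketch {

    // Cache of already translated "methods", keyed by method name
    private final Map<String, IntSupplier> compiled = new HashMap<>();
    private int translations = 0;

    // Stand-in for the expensive interpret-and-compile step
    private IntSupplier translate(String methodName) {
        translations++;
        return () -> methodName.length();
    }

    // Translates the method on first use and reuses the result afterwards
    int invoke(String methodName) {
        IntSupplier code = compiled.computeIfAbsent(methodName, this::translate);
        return code.getAsInt();
    }

    int translations() {
        return translations;
    }

    public static void main(String[] args) {
        TranslationCacheSketch vm = new TranslationCacheSketch();
        for (int i = 0; i < 10_000; i++) {
            vm.invoke("hello");
        }
        System.out.println("invocations: 10000, translations: " + vm.translations());
    }
}
```

The loop makes 10,000 calls, but the translation step runs only for the first one.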


Monday, March 08, 2021

Variable declaration inside if statement

Took a while for me to figure this out... 

Consider the below piece of code

public class Main {

    public static void main(String[] args) {

        boolean flag = true;
        
        if(flag)
            String message = "I will not compile";
        
        if(flag) {
            String message = "Here I am ok...";
        }
    }
}

In this code, the first if statement does not compile. The second if statement is all fine.

The only difference here is

- in the first one, there are no curly braces surrounding the body of the if statement.

- in the second one, we have curly braces surrounding the body of the if statement.

Compiling this code throws the error message "variable declaration not allowed here"

Main.java:7: error: variable declaration not allowed here
                        String message = "I will not compile";

So the variable declaration is not allowed inside the if statement, when it is not surrounded by curly braces.

But why is the variable declaration not allowed when we do not have curly braces? 

Is it not perfectly fine to avoid curly braces when we have only one statement inside the if block?

The first reason I could think of is: 

Variable declaration and value assignment to that variable are considered as two statements by the compiler, though it's written in a single line. Since only one statement is allowed inside of an if statement when it's not surrounded by braces, the compiler rejects it.

This argument made some sense, but then why does the compiler not just take the 1st statement as contained within the if block and the 2nd as contained outside of the if block, even when variable declaration and value assignment are considered as two separate statements?

That would not make sense either, as then we would have the variable declaration inside of the if block and the assignment of value to that variable outside of the if block, where that variable is not visible.

So the compiler rejecting the code for the reason that a single line of code like 

String message = "I will not compile";

actually represents two statements as shown below is along expected lines.

String message;
message = "I will not compile";

Except for one catch. What if we only do a variable declaration inside an if block that is not surrounded by braces?

Of course that variable will not be of any use. But so is the case with our original example - declaring a variable and assigning a value to it is of no use either, unless it is used somewhere, isn't it?

So let's try that out

public class Main1 {

    public static void main(String[] args) {

        boolean flag = true;
        
        if(flag)
            String message; // Compilation error here
        
        if(flag) {
            String message; // Allowed
        }

    }
}

We have made the statement inside if block as atomic as possible - having only a variable declaration without assigning any value to it. But here again, the one without curly braces throws compilation error while the other with curly braces compiles fine.  

OK... Time for some searching around...

And it turns out that a local variable declaration is not an ordinary statement as far as the Java grammar is concerned. The body of an if statement must be a single statement or a block, and the Java Language Specification allows a local variable declaration only directly inside a block.

The reasoning usually given is about scope. Without braces there is no block for the variable to belong to. If it were placed in the surrounding scope of the if statement, it would be declared conditionally - defined if and only if the if condition evaluates to true (too many if's... :)) - so it would not be guaranteed to be available in the surrounding scope in all scenarios. And if it were confined to the body of the if statement, it could never be used. Hence the compiler does not allow this statement and throws an error.

And for the same reason, variable declaration is not allowed in looping statements as well, when the body of the statement is not surrounded by curly braces.

Below code shows all the error scenarios

public class Main2 {

    public static void main(String[] args) {

        boolean flag = true;
        
        if(flag)
            String message = "I will not compile";
        
        for(; flag ;)
            String message = "I will not compile";
            
        while(flag)
            String message = "I will not compile";
    }
}
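For contrast, here is a small sketch of the same three statements with the bodies wrapped in curly braces, which compiles fine. The class name BracedDeclarations and the counter are only there to make the example runnable and checkable; the loops are changed so that they terminate.

```java
// Sketch: the same declarations as in the error scenarios, but inside blocks.
class BracedDeclarations {

    // Returns how many of the three braced bodies were executed
    static int run() {
        boolean flag = true;
        int executed = 0;

        if (flag) {
            String message = "I compile inside an if block";
            executed += message.isEmpty() ? 0 : 1;
        }

        for (int i = 0; i < 1; i++) {
            String message = "I compile inside a for block";
            executed += message.isEmpty() ? 0 : 1;
        }

        while (flag) {
            String message = "I compile inside a while block";
            executed += message.isEmpty() ? 0 : 1;
            flag = false; // leave the loop after one pass
        }

        return executed;
    }

    public static void main(String[] args) {
        System.out.println("Bodies executed: " + run());
    }
}
```

Each message variable lives only inside its own block, which is why the same name can be declared three times.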


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Curious-Cases/Variable-declaration-inside-if-statement

Saturday, March 06, 2021

Using -Xlint:text-blocks compiler option

Consider this piece of code:

public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                The woods are lovely, dark and deep,        
                But I have promises to keep,      
                And miles to go before I sleep, 
                And miles to go before I sleep.    
                """;
        System.out.println(poemTextBlock);
    
    }
}

This seemingly perfect code when executed produces the following output




The indentation and white spaces included within the string produced by this text block are probably not what was intended.

To help identify this not so obviously visible issue, the -Xlint:text-blocks compiler option was introduced.

When compiling the code with this option, the compiler emits warning messages highlighting issues with white spaces used within the text block.

It specifically shows these two warning messages

  • inconsistent white space indentation - shown if there is inconsistency in the incidental white space characters across the lines within the text block
  • trailing white space will be removed - shown if a trailing space is present in any of the lines within the text block that would be stripped off

Try compiling the above program with -Xlint:text-blocks flag included as in the command below

javac -Xlint:text-blocks Main.java

This gives the two warning messages, as shown below 

Main.java:5: warning: [text-blocks] inconsistent white space indentation
                String poemTextBlock = """
                                       ^
Main.java:5: warning: [text-blocks] trailing white space will be removed
                String poemTextBlock = """
                                       ^
2 warnings



Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Using--Xlinttext-blocks-compiler-option

New escape sequence - \

This new escape sequence "\<line terminator>" can be used when we do not want to include a new line character at the end of a line within a text block. 

When used, this escape sequence effectively suppresses the new line character that gets implicitly included at the end of that line.

Below code shows the usage of this escape sequence

public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                The woods are lovely, dark and deep, \
                But I have promises to keep,
                And miles to go before I sleep, \
                And miles to go before I sleep.
                """;
        System.out.println(poemTextBlock);
    
    }
}

In this code, we have used the "\<line terminator>" escape sequence on 1st and 3rd lines within the text block. 

This suppresses the new line character on the 1st and 3rd lines and produces the below output


When using this, take care to ensure that the "\" at the end of the line is immediately followed by the line terminator without leaving any blank spaces after the "\". 

Leaving a blank space at the end accidentally will throw a compilation error stating "illegal escape character"
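As a quick check of this behaviour, the sketch below compares a text block that uses the escape with the equivalent regular string. The class name LineContinuation is made up for this example.

```java
// Sketch: "\<line terminator>" joins two source lines into one line of the string.
class LineContinuation {

    static String joined() {
        return """
                The woods are lovely, dark and deep, \
                But I have promises to keep,
                """;
    }

    public static void main(String[] args) {
        String expected =
                "The woods are lovely, dark and deep, But I have promises to keep,\n";
        System.out.println(joined().equals(expected));
    }
}
```

The space before the "\" is kept, since it is no longer the last character on its line.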

 

Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/New-escape-sequence-2

New escape sequence - \s

Two new escape sequences got introduced with text blocks

  • \s
  • \<line terminator> 

First let's see how to use "\s"

"\s" is the escape sequence for adding a space character within the string. It can be used in both regular strings and in text blocks. 

Below code shows the usage of "\s" when used within a text block and a regular string. 

public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                The\swoods\sare\slovely,\sdark\sand\sdeep\s""";
        String poemString = "The\swoods\sare\slovely,\sdark\sand\sdeep\s";

        System.out.println(poemTextBlock);
        System.out.println(poemString);
    
    }
}


Strings produced by both this text block and the regular string expression are the same. The output is shown below 




We can see that each "\s" is replaced by a single space, including the one at the end of the line (remember that trailing spaces at the end of a line get stripped off in text blocks, but not when the escape sequence equivalent is used for providing the space)

"\s" can be used to include trailing spaces within a text block. The escape sequence approach or the fencing approach can be used with "\s" to include the needed spaces as explained here. All the techniques given in this post for the usage of the octal escape sequence "\040" can be applied with "\s" as well.

We will see the other new escape sequence introduced with text blocks - \<line terminator> - in the next post
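The trailing space behaviour of "\s" can be checked with a small sketch like the one below. The class name TrailingSpaceEscape is only for illustration.

```java
// Sketch: spaces written as \s at the end of a line survive text block processing.
class TrailingSpaceEscape {

    static String block() {
        return """
                deep,\s
                keep,\s\s
                """;
    }

    public static void main(String[] args) {
        // Mark the end of each line so the trailing spaces become visible
        block().lines().forEach(line -> System.out.println("[" + line + "]"));
    }
}
```

The first line of the resulting string ends with one space and the second line with two.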


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/New-escape-sequence-1

Techniques for including trailing whitespaces into text blocks

Trailing whitespaces can be included in a text block using one of the following approaches

Character substitution

Here we include a special character in the text block in place of the trailing whitespaces and replace it with a space after the text block is processed by the compiler. Code for this is shown below

public class CharacterSubstitution {

    public static void main(String[] args) {

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,###
                            But I have promises to keep,###
                            And miles to go before I sleep,###
                            And miles to go before I sleep.###
                        </pre>
                    </body>
                </html>""".replace('#',' ');
    
        System.out.println(poemTextBlock);
    }
}

Character fencing

Here we include the needed trailing spaces. But instead of ending the line with a space, we include a special fence character at the end so that the spaces are not considered trailing spaces and hence are not stripped away.

We remove this fence character after the text block is processed using the replace method as shown in the code below

public class CharacterFencing {

    public static void main(String[] args) {

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,   #
                            But I have promises to keep,   #
                            And miles to go before I sleep,   #
                            And miles to go before I sleep.   #
                        </pre>
                    </body>
                </html>""".replace("#\n","\n");
    
        System.out.println(poemTextBlock);
    }
}

Escape sequence for space:

Here we use the octal escape sequence for space in the text block where we need trailing spaces. Sample code for this is shown below

public class EscapeSequence {

    public static void main(String[] args) {

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,\040\040\040
                            But I have promises to keep,\040\040\040
                            And miles to go before I sleep,\040\040\040
                            And miles to go before I sleep.\040\040\040
                        </pre>
                    </body>
                </html>""";
    
        System.out.println(poemTextBlock);
    }
}

Note that the unicode escape sequence for space cannot be used, as unicode escapes are translated prior to lexical analysis, whereas octal escape sequences get processed after lexical analysis.

What exactly happens if we use a unicode escape sequence inside of a text block? That's a topic to explore in a separate post.

Since the escape sequences get processed later in the processing, the octal escape sequence for space can be used as a fencing character to include trailing blank spaces. Here we do not have to replace the fencing character as it is also a space character that we want to include.

The below code shows this. Here we use two regular whitespace characters followed by an octal whitespace escape sequence to include one more whitespace.

public class EscapeSequence1 {

    public static void main(String[] args) {

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,  \040
                            But I have promises to keep,  \040
                            And miles to go before I sleep,  \040
                            And miles to go before I sleep.  \040
                        </pre>
                    </body>
                </html>""";
    
        System.out.println(poemTextBlock);
    }
}

The output here includes three whitespaces at the end of lines 4 to 7.







So far, we have used only the space character in all our examples. But the tab character also represents whitespace and is widely used for code indentation and formatting. How does the tab character behave when used within a text block? We will explore that in our next post.


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Techniques-for-including-trailing-whitespaces

System.out.println() vs. System.out.print("\n")

There is a subtle difference between these two statements

System.out.println()

and

System.out.print("\n")

Though on the surface they both seem to be doing the same thing, and in fact they are doing the same thing - printing a new line to the console - there is a subtle difference between the two that is worth taking note of.

System.out.print("\n"): Always prints "\n" to the console. This is the platform neutral way of printing a new line character to the console. 

System.out.println(): Prints the platform specific line separator to the console, which is different for different OS. On Windows, it prints "\r\n". On Linux it prints "\n" and so on...

This code demonstrates this

public class Main {

    public static void main(String[] args) {

        System.out.println();
        System.out.print("\n");
    }
}

It produces the below output on my windows laptop



System.out.println() is equal to System.out.print(System.lineSeparator()) - both of which produce the same output, printing the platform specific line separator to the console

public class Main {

    public static void main(String[] args) {

        System.out.println();
        System.out.print(System.lineSeparator());
    }
}

This code produces the same output for both the print statements
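Since the difference is invisible on the console, one way to see it is to capture what each call writes and compare the characters. The sketch below does that with an in-memory stream; the class NewlineCapture and its helper methods are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Sketch: captures what println() and print("\n") actually write.
class NewlineCapture {

    // Runs the action against an in-memory PrintStream and returns what was written
    static String capture(Consumer<PrintStream> action) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer, true, StandardCharsets.UTF_8)) {
            action.accept(out);
        }
        return buffer.toString(StandardCharsets.UTF_8);
    }

    // Makes carriage return and line feed characters visible
    static String visible(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        System.out.println("println()   wrote: " + visible(capture(out -> out.println())));
        System.out.println("print(\"\\n\") wrote: " + visible(capture(out -> out.print("\n"))));
        System.out.println("lineSeparator() is: " + visible(System.lineSeparator()));
    }
}
```

What gets printed for println() depends on the platform the program runs on, while print("\n") always writes a single "\n".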




Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Curious-Cases/println()-vs.-print()

Escape sequence in text blocks - \"

So we want to include three double quotes in the string contained within a text block. 

Say if we want the processed string to be as the one shown here

The woods are """lovely, dark and deep,
But I have """promises to keep,
And miles to go """before I sleep,
And miles to go """before I sleep.

We can use three double quotes with an escape sequence, like \""". Here \""" is not a new escape sequence. In fact, the only escape sequence here is \" - the escape sequence for a double quote. The next two double quotes are actual characters included in the string.

The escaped double quote can be used for any of the three double quotes. The below code shows this. This code produces the same output string that is shown above.

public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                The woods are \"""lovely, dark and deep,
                But I have "\""promises to keep,
                And miles to go ""\"before I sleep,
                And miles to go \"\"\"before I sleep.
                """;
        System.out.println("Text block: \n"+poemTextBlock);

    }
}

In this program, we escape the double quote at different positions in each line and for the last line, we use escape sequence for each of the three double quote characters. 

Where we need to include three or more continuous double quotes within a text block, we will have to use escape sequence so as to avoid having three continuous double quotes which will end the text block.
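That the position of the escaped quote does not matter can be verified with a small sketch. The class EscapedQuotes and its method names are invented for this example; each method escapes a different one of the three quotes.

```java
// Sketch: escaping any one of three consecutive double quotes gives the same string.
class EscapedQuotes {

    static String escapeFirst() {
        return """
                \"""lovely""";
    }

    static String escapeSecond() {
        return """
                "\""lovely""";
    }

    static String escapeThird() {
        return """
                ""\"lovely""";
    }

    public static void main(String[] args) {
        System.out.println(escapeFirst());
        System.out.println(escapeFirst().equals(escapeSecond())
                && escapeSecond().equals(escapeThird()));
    }
}
```

All three methods return the same value: three double quotes followed by the word lovely.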

Below code includes five continuous double quotes before the word 'lovely'

        String poemTextBlock = """
                The woods are \"""\""lovely, dark and deep,
                """;



Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Escape-sequences-in-text-blocks-2

Friday, March 05, 2021

Escape sequences in text blocks - '\n'

All the escape sequences that can be used with the String, can be used in text blocks as well. 

But some escape sequences are not required to be used within text blocks. The actual character can be used directly instead. The most common escape sequence that can and should be avoided where possible in text blocks is the new line character '\n'.

At the end of each line of the multi-line string literal represented by the text block, a new line character '\n' is included by the Java compiler when processing the text block.

We saw many examples of this in the previous posts.

There are a few tricks that we need to be aware of. First, let's see what happens if we explicitly include '\n' at the end of each line inside the text block

Below code shows a regular text block and the same with '\n' included at the end of each line 

public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                The woods are lovely, dark and deep,
                But I have promises to keep,
                And miles to go before I sleep,
                And miles to go before I sleep.
                """;

        String poemTextBlockWithNewLine = """
                The woods are lovely, dark and deep,\n
                But I have promises to keep,\n
                And miles to go before I sleep,\n
                And miles to go before I sleep.
                """;

        System.out.println("Text block: \n"+poemTextBlock);
        System.out.println("Text block with new line: \n"+poemTextBlockWithNewLine);

    }
}

The '\n' at the end of each line introduces an additional new line between each of the lines and produces the output shown below


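The '\n' escape sequence is not treated as whitespace while the text block is stripped, because escape sequences are translated only after the whitespace stripping is done. So it is not stripped away by the Java compiler when processing the text block.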

Now consider the below snippet of code.  

Here we have two '\n' on the first line of text block, with a tab included in between.  

How does this get processed? There are four leading tab spaces in each of the lines of the text block, but between the two new line characters there is just one tab space. Does this impact the indentation of the lines making up the text block? More specifically, will this modify the position of the left margin for incidental whitespace stripping?

The above text block when printed has a value shown below


As you can see, the left margin is not affected by the whitespaces included between '\n' characters. Also note the presence of tab characters at the beginning of the 2nd line, indicating it is preserved and has not got stripped away in the processing. 

This is because escape translation happens as the last step in the processing of a text block.

The left margin gets identified and incidental whitespaces get stripped away before the '\n' escape sequence gets processed in our example. Escape sequence processing happens as the last step, making the '\n' characters include additional line feeds within the string.

This is also the reason why we may have to use '\n' explicitly - to include an empty line with a specific no. of whitespace characters without impacting the margin of other lines within a text block.

There is no other way of doing this when defining a text block.   
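The order of processing can be seen in a small sketch like the one below, where the spaces that follow the '\n' escape survive and the margin of the other lines is not affected. The class name ExplicitNewline is made up for this example.

```java
// Sketch: incidental whitespace is stripped first, escapes are translated last.
class ExplicitNewline {

    static String block() {
        return """
                first line\n  two spaces kept after the escape
                second line
                """;
    }

    public static void main(String[] args) {
        String expected =
                "first line\n  two spaces kept after the escape\nsecond line\n";
        System.out.println(block().equals(expected));
        System.out.print(block());
    }
}
```

To the compiler, the two spaces after '\n' sit in the middle of a source line while the margin is computed, so they are neither counted as indentation nor stripped.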


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Escape-sequences-in-text-blocks-1

Thursday, March 04, 2021

Normalization of platform specific line terminator characters

Line terminator character is platform specific. 

You can find the line termination character for your platform using the System.lineSeparator() API call

On my windows machine, I get "\r\n" as the line termination character. 

jshell> System.lineSeparator()
$1 ==> "\r\n"

Unix & Linux use "\n" as the line termination character & some older versions of Mac OS use "\r" as the line termination character.

This poses a few issues with handling multi-line string literals represented by text blocks

  • Some editors may automatically change the line termination character
  • When the source file gets edited on different platforms, there is a chance of different line termination characters getting used within the same text block.

To avoid these issues, the Java compiler normalizes the line termination characters inside the multi-line string literal in text blocks to '\n' while processing. So, all the different line termination characters "\r", "\r\n" and "\n" become "\n" after processing a text block.

Let us check this with the below program:


Code is shown as an image to make the line termination characters visible. Here the line termination character is "\r\n" represented by CR|LF

This produces the following output

Comparing with \n:true
Comparing with \r\n:false

indicating that \r\n line termination character in the source code is converted to \n after the text block is processed.

Below is the version of the code for copy/pasting if needed. 

public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                 And miles to go before I sleep.
                 """;
 
        System.out.println("Comparing with \\n:" + "And miles to go before I sleep.\n".equals(poemTextBlock));
        System.out.println("Comparing with \\r\\n:" + "And miles to go before I sleep.\r\n".equals(poemTextBlock));
    }
}

In the next post, we will see if and how escape sequences can be used within text blocks


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Normalization-of-platform-specific-line-terminator-characters

Wednesday, March 03, 2021

Tab character in text blocks

The problem with the tab character is that there is no standard definition for how many spaces it represents. Different editors use a different no. of spaces to display the tab character and most editors let users configure it as per their preference.

For this reason, Java treats a tab character as of size 1 when processing the text block. Irrespective of the no. of spaces that a tab may occupy when displayed in your editor, when the text block gets processed by the Java compiler, its whitespace size is counted as 1.

Consider the code snippet shown below 


Tab character here is represented by a long arrow ---->

In the editor I use, tab character occupies 4 spaces. When looking at this code in the editor, it has a well indented text block defined, as shown below:

public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,
                            But I have promises to keep,
                            And miles to go before I sleep,
                            And miles to go before I sleep.
                        </pre>
                    </body>
                </html>""";

        System.out.println(poemTextBlock);
    }
}

But when executed, the output of the above program is


Since the Java compiler counts each tab as a single character, lines 4-7 contain only 7 leading whitespace characters (7 tab characters).

Line 1, which looks the least indented, contains 16 leading whitespace characters (16 space characters).

To the Java compiler, lines 4-7 are therefore the least indented, and the start of these lines is taken as the left margin.

Note that the Java compiler does not convert the tab character into a space character when processing. It only counts each tab as a single whitespace character for the purpose of fixing the left margin and stripping away the incidental whitespace from the text block.

Below code demonstrates this


This code has a different number of tabs on each of the lines 5, 6, 7 & 8.

Line 5 has the fewest tabs and is taken as the left margin by the Java compiler.

When this text block is printed, it produces the below output, indicating that the tab characters are preserved. When printed on the console, each tab is rendered as however many spaces the console is configured to use


The full code for the above example, as it appears in my editor, is given below for your reference.

public class Main1 {

    public static void main(String[] args) {

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,
                                But I have promises to keep,
                                    And miles to go before I sleep,
                                        And miles to go before I sleep.
                        </pre>
                    </body>
                </html>""";

        System.out.println(poemTextBlock);
    }
}


Note: While copy/pasting the code samples from this post, you might have to edit the code to make sure that tab and space characters are correctly pasted for you to see it working as explained here. 
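
Since literal tabs are easy to lose while copy/pasting, the same margin rule can also be tried with String.stripIndent(), which, as I understand it, applies the same incidental whitespace algorithm that the compiler uses for text blocks. Below is a minimal sketch. The class name TabMarginCheck and the sample words are made up for this illustration, and the tabs are written as \t escapes in an ordinary string so that they survive copying.

```java
public class TabMarginCheck {

    // "one" has 3 leading tabs, "two" has 4 leading tabs, "three" has 4 leading spaces.
    // The string deliberately does not end with a line terminator,
    // because a trailing terminator would make stripIndent() use a margin of zero.
    static String sample() {
        return "\t\t\tone\n\t\t\t\ttwo\n    three";
    }

    public static void main(String[] args) {
        String stripped = sample().stripIndent();

        // Each tab counts as one whitespace character, so the first line
        // (3 characters) is the least indented and sets the left margin.
        // The extra tab on the second line is kept as a tab, not turned into spaces.
        // Print the result with visible escapes.
        System.out.println(stripped.replace("\t", "\\t").replace("\n", "\\n"));
        // prints: one\n\ttwo\n three
    }
}
```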


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Tab-character-in-text-blocks


Tuesday, March 02, 2021

Techniques for including leading white spaces into text blocks

We saw one approach for controlling leading indentation: moving the position of the closing three double quotes as required.

This approach does not work, however, when we do not want a new line after the last line of the text block. The code then becomes

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,
                            But I have promises to keep,
                            And miles to go before I sleep,
                            And miles to go before I sleep.
                        </pre>
                    </body>
                </html>""";

Here we will not be able to use the position of """ to dictate the leading indentation required.

We will have to use the indent() method of String to provide the necessary indentation. The code for this is shown in the sample below

public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,
                            But I have promises to keep,
                            And miles to go before I sleep,
                            And miles to go before I sleep.
                        </pre>
                    </body>
                </html>""".indent(8);
    
        System.out.println(poemTextBlock);
    }
}
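
One thing to be aware of with this technique: as per the Javadoc of String.indent(), every line of the result, including the last one, is suffixed with \n. So the indented string ends with a line terminator even though the closing three double quotes sit on the last line of content. Below is a minimal sketch that shows this, with stripTrailing() used as one possible way to drop that final terminator. The class name IndentCheck and the shortened HTML are made up for this illustration.

```java
public class IndentCheck {

    // Hypothetical helper: a two-line text block without a trailing new line,
    // indented by 8 spaces using indent().
    static String indented() {
        String html = """
                <html>
                </html>""";
        // indent(8) prefixes each line with 8 spaces and ends each line with \n.
        return html.indent(8);
    }

    public static void main(String[] args) {
        String withNewline = indented();

        // stripTrailing() removes trailing whitespace, including the final \n,
        // and leaves the leading indentation untouched.
        String withoutNewline = withNewline.stripTrailing();

        System.out.println(withNewline.endsWith("\n"));          // true
        System.out.println(withoutNewline.endsWith("</html>"));  // true
        System.out.println(withoutNewline.startsWith("        <html>")); // true
    }
}
```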


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Techniques-for-including-leading-white-spaces


Incidental and essential white spaces in text blocks

Consider the below piece of code containing a text block declaration

String poemTextBlock = """               
    <html>
        <body>
            <pre>
                The woods are lovely, dark and deep,   
                But I have promises to keep,   
                And miles to go before I sleep,   
                And miles to go before I sleep.
            </pre>
        </body>
    </html>
    """;

From this, can we infer how the spaces in the text block's string get handled?

Would all leading spaces get into it? 

And what about trailing spaces if there are any, included at the end of some of the lines?

Let's first see where the spaces are in the above block of code.

For that, we will refer to the below screen grab from the editor, with whitespace visibility set to 'Yes'


Spaces are indicated by dot (.) and end of line by CRLF 

Note that there are some trailing spaces on lines 5, 6 & 7.

So with all these leading and trailing spaces, how does the text block format the string contained in it?

We do not want it to retain all the spaces as-is

  • Trailing spaces may not be intentional, and they are not even visible in most editors for us to check and correct.
  • A part of the leading spaces was introduced just to align with the indentation of the surrounding code. If they were retained, changing the indentation of the code would change the content of the text block.

And we do not want it to simply remove all leading and trailing spaces either.

That would leave the final string without any indentation, as shown below, which definitely is not what we want.



To understand how Java handles leading & trailing whitespace within a text block, let's first look at what incidental and essential whitespace are

Incidental whitespace
This is the whitespace that is
  • Leading on each line, up to the indentation of the least indented line within the text block
  • Trailing at the end of each line

Essential whitespace
Leading whitespaces on each line that are not incidental are essential whitespaces. They are essential for providing the indentation of the text contained within the text block. 

Incidental whitespaces are stripped away and essential whitespaces are retained by the Java compiler when processing a text block. 

As a result, the text block in the above code is formatted as shown below after processing


Note that all the incidental whitespace is removed, while the essential whitespace is retained in the processed text.

The full code to test this sample is shown below:
public class Main {

    public static void main(String[] args) {

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,
                            But I have promises to keep,
                            And miles to go before I sleep,
                            And miles to go before I sleep.
                        </pre>
                    </body>
                </html>
                """;
        System.out.println(poemTextBlock);
    }
}
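
The stripping rules can also be verified in code, by comparing a text block against an ordinary string literal with the expected content. Below is a minimal sketch using a shortened HTML snippet. The class name IncidentalWhitespaceCheck and the helper html() are made up for this illustration.

```java
public class IncidentalWhitespaceCheck {

    // Hypothetical helper: content lines and the closing delimiter
    // all start at (or to the right of) column 16.
    static String html() {
        return """
                <html>
                    <body>
                    </body>
                </html>
                """;
    }

    public static void main(String[] args) {
        // The 16 spaces to the left of <html> are incidental and are removed.
        // The 4 extra spaces before <body> and </body> are essential and are kept.
        String expected = "<html>\n    <body>\n    </body>\n</html>\n";

        System.out.println(html().equals(expected)); // true
    }
}
```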

Note that the line with the closing three double quotes is also considered when establishing the left margin for incidental whitespace

In the code fragment below, the closing three double quotes are indented less than every content line, so they set the left margin, and the string keeps the leading spaces that lie to the right of that margin

        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,
                            But I have promises to keep,
                            And miles to go before I sleep,
                            And miles to go before I sleep.
                        </pre>
                    </body>
                </html>
        """;

Above code will produce a string that is formatted as shown below, with the leading spaces included
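
The effect of the closing delimiter's position can also be checked with a smaller block. In the sketch below (the class name ClosingDelimiterCheck is made up for this illustration), the closing three double quotes are placed 4 columns to the left of the content, so each line is expected to keep 4 leading spaces.

```java
public class ClosingDelimiterCheck {

    // Hypothetical helper: content starts at column 16,
    // the closing delimiter starts at column 12.
    static String html() {
        return """
                <html>
                </html>
            """;
    }

    public static void main(String[] args) {
        // The closing delimiter line is the least indented line, so it sets the margin.
        // The 4 spaces between that margin and the content are retained.
        String expected = "    <html>\n    </html>\n";

        System.out.println(html().equals(expected)); // true
    }
}
```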


We have seen in this post, how Java handles leading and trailing whitespaces by stripping away the incidental whitespaces. But what if we want to include leading or trailing whitespaces into text blocks? We will see how to do that in the next post.

 

Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Incidental-and-essential-white-spaces

Monday, March 01, 2021

Text Block Syntax - Deep Dive

In the previous post, we mentioned a subtle difference between the string formed through Text Block syntax and the one formed through regular String syntax.

Let's examine that through this sample code

public class Main {

    public static void main(String[] args) {
        
        String poemTextBlock = """
                The woods are lovely, dark and deep,   
                But I have promises to keep,   
                And miles to go before I sleep,   
                And miles to go before I sleep.
                """;
        
        String poemString = "The woods are lovely, dark and deep,\n"
                + "But I have promises to keep,\n"
                + "And miles to go before I sleep,\n"
                + "And miles to go before I sleep.";
        
        System.out.println("Text Block: "+poemTextBlock);
        System.out.println("String: "+poemString);
        
        if (poemString.equals(poemTextBlock)) {
            System.out.println("Textblock and String are equal");
        } else {
            System.out.println("Textblock and String are NOT equal");
        }

    }
}

Here we are comparing the two strings and printing if they are equal or not. 

Output of this code is 

Text Block: The woods are lovely, dark and deep,
But I have promises to keep,
And miles to go before I sleep,
And miles to go before I sleep.

String: The woods are lovely, dark and deep,
But I have promises to keep,
And miles to go before I sleep,
And miles to go before I sleep.
Textblock and String are NOT equal

Yes. They are not equal. It is due to the new line at the end of the string formed by the Text Block. 

The strings compare as equal when we place the closing three double quotes on the same line as the last line of the content, as shown below

        String poemTextBlock = """
                The woods are lovely, dark and deep,   
                But I have promises to keep,   
                And miles to go before I sleep,   
                And miles to go before I sleep.""";

This avoids adding a line terminator (\n) character after the last line of the string.

But then, what about the first line? Wouldn't the code above introduce a \n before the first line?

It turns out that the three double quotes followed by a line terminator together mark the beginning of the text block. The content of a text block cannot start on the same line as the opening three double quotes.

The code below fails with a compile time error

        String poemTextBlock = """The woods are lovely, dark and deep,   
                But I have promises to keep,   
                And miles to go before I sleep,   
                And miles to go before I sleep.""";

So the content of a text block begins on the line after the opening three double quotes. And placing the closing three double quotes on the same line as the last line of the content avoids adding a line terminator after the last line.
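
Both of these points can be checked in code. Below is a minimal sketch with two versions of a one-line text block, one with the closing three double quotes on their own line and one with them on the content line. The class name TrailingNewlineCheck and its helper methods are made up for this illustration.

```java
public class TrailingNewlineCheck {

    // Closing delimiter on its own line: the content ends with a line terminator.
    static String withTerminator() {
        return """
                And miles to go before I sleep.
                """;
    }

    // Closing delimiter on the last content line: no line terminator at the end.
    static String withoutTerminator() {
        return """
                And miles to go before I sleep.""";
    }

    public static void main(String[] args) {
        // The line terminator right after the opening delimiter is not part of the content.
        System.out.println(withTerminator().startsWith("And")); // true

        // The two strings differ only by the final line terminator.
        System.out.println(withTerminator().equals(withoutTerminator() + "\n")); // true
    }
}
```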

What about the spaces used for indentation of the text block content? Would those spaces become part of the text block content? 

We see from the code above that they have not been part of the content in this example. We will explore how the indentation behaves in the text block in our next post.

 

Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Text-Block-Syntax

Sunday, February 28, 2021

The Text Blocks

Java 13 introduced Text Blocks as a preview feature. They were enhanced and continued as a preview feature in the Java 14 release. Java 15 made them a permanent feature of the language.

So what is a text block?

The short answer is that it is a multi-line string with special syntax. The purpose of introducing it is to simplify and minimize the code required to create multi-line strings.

Assume for example, we want to capture this famous last stanza of Robert Frost's poem as a Java String. 

The woods are lovely, dark and deep,   
But I have promises to keep,   
And miles to go before I sleep,   
And miles to go before I sleep.

Below code snippet captures this

        String poemString = "The woods are lovely, dark and deep,\n"
                + "But I have promises to keep,\n"
                + "And miles to go before I sleep,\n"
                + "And miles to go before I sleep.";

But this has multiple strings appended through + operator. Also note the '\n' included at the end of each line to capture the new line into the String. 

Things get more complicated for lengthy string literals, which are not uncommon with HTML, XML & JSON often getting embedded in code. And when the pieces being appended are not all constants, many String appends can carry a performance cost, which is why StringBuilder or StringBuffer is often used instead.

Wouldn't it be better if we could get rid of this and have a simpler syntax to capture such multi-line strings?

This essentially is what the Text Block feature provides. 

Below snippet shows how to use a Text Block to capture the same String

        String poemTextBlock = """
                The woods are lovely, dark and deep,   
                But I have promises to keep,   
                And miles to go before I sleep,   
                And miles to go before I sleep.
                """;

We start and end a Text Block with three double quotes, as in """. The format of the text inside the Text Block is preserved, and there is no need to add '\n' for the new lines.

This greatly simplifies the code when embedding HTML, XML, JSON, SQL, JavaScript or other such code snippets in a string literal.
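
To get a feel for the difference with embedded snippets, below is a small sketch that builds the same JSON both ways. The class name JsonTextBlock and the JSON content are made up for this illustration. Note that the double quotes inside the text block do not need to be escaped.

```java
public class JsonTextBlock {

    // Regular string syntax: every quote is escaped and every line needs \n and +.
    static String concatenated() {
        return "{\n"
                + "    \"poet\": \"Robert Frost\",\n"
                + "    \"lines\": 4\n"
                + "}";
    }

    // Text Block syntax: the JSON is written as it would appear in a file.
    static String textBlock() {
        return """
                {
                    "poet": "Robert Frost",
                    "lines": 4
                }""";
    }

    public static void main(String[] args) {
        System.out.println(textBlock());
        System.out.println(concatenated().equals(textBlock())); // true
    }
}
```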

The full code for the poem example is embedded below

public class Main {

    public static void main(String[] args) {
        
        String poemTextBlock = """
                The woods are lovely, dark and deep,   
                But I have promises to keep,   
                And miles to go before I sleep,   
                And miles to go before I sleep.
                """;
        
        String poemString = "The woods are lovely, dark and deep,\n"
                + "But I have promises to keep,\n"
                + "And miles to go before I sleep,\n"
                + "And miles to go before I sleep.";
        
        System.out.println("Text Block: "+poemTextBlock);
        System.out.println("String: "+poemString);

    }
}

Note that there is a minor difference in the values represented by poemString & poemTextBlock in the above code.

Try to figure out that difference... We will uncover that in our next post.


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/02/Language-Features/Text-blocks/The-Text-Blocks

Using binding variable in the expression of if statement

Yes. We can use the binding variable inside the condition of an if statement, as long as it is assured that the instanceof operator has evaluated to true by the time the code flow reaches that usage

Below sample code illustrates this


interface Pet {
    default public String color() {
        return this.color();
    }
}

record Dog(String color) implements Pet {

    public void bark() {
        System.out.println("I bark...");
    }
}

record Cat(String color) implements Pet {

    public void meaw() {
        System.out.println("I meaw...");
    }
}

public class Main {

    public static void main(String[] args) {

        Pet p1 = new Dog("White");
        Pet p2 = new Cat("White");

        if (p1 instanceof Dog d && d.color().equals("White") && 
                p2 instanceof Cat c && c.color().equals("White")) {
            d.bark();
            c.meaw();
        }
    }
}

Here we are using the binding variables c & d inside the condition of the if statement. Note that this usage is allowed with the short-circuit && operator, which makes sure the subsequent expression gets evaluated only when the preceding condition evaluates to true.
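
The same rule applies to any boolean expression, not only to the condition of an if statement. Below is a minimal sketch with a helper method, isNonEmptyString(), that is made up for this illustration and is not part of the sample code of this post.

```java
public class BindingInCondition {

    // Hypothetical helper: true only for a String that has at least one character.
    static boolean isNonEmptyString(Object o) {
        // s is in scope on the right of && because that side is evaluated
        // only when the instanceof check has already evaluated to true.
        return o instanceof String s && !s.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isNonEmptyString("White")); // true
        System.out.println(isNonEmptyString(""));      // false
        System.out.println(isNonEmptyString(42));      // false
    }
}
```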


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/02/Language-Features/Instanceof-Binding-Variable/In-if-expression

Binding variable scope with NOT (!) condition

Let us introduce a NOT (!) operator to the if condition and study the scope of the binding variable in this case.

The full sample code with the NOT condition is shown below


interface Pet {
    default public String color() {
        return this.color();
    }

}

record Dog(String color) implements Pet {

    public void bark() {
        System.out.println("I bark...");
    }
}

record Cat(String color) implements Pet {

    public void meaw() {
        System.out.println("I meaw...");
    }
}

public class Main {

    public static void main(String[] args) {

        Pet p1 = new Dog("White");
        Pet p2 = new Cat("White");

        if (!(p1 instanceof Dog d)) {
            return;
        }
        
        if (!(p2 instanceof Cat c)) {
            return;
        }

        d.bark();
        c.meaw();
    }
}

This code compiles and executes as expected. 

Here the binding variables are not accessible inside the if blocks, because of the NOT condition, but they are accessible after the if statements.

If you observe the code carefully, we are returning immediately if p1 is not a Dog or p2 is not a Cat. Code flow reaches beyond the if statements if and only if p1 is a Dog and p2 is a Cat.

It is this unconditional return statement inside the if block that makes this code work. Had the main method been written as below instead, it would not compile

    public static void main(String[] args) {

        Pet p1 = new Dog("White");
        Pet p2 = new Cat("White");

        if (!(p1 instanceof Dog d)) {
            System.out.println("p1 is not Dog");
        }
        
        if (!(p2 instanceof Cat c)) {
            System.out.println("p2 is not Cat");
        }

        d.bark();
        c.meaw();
    }

This code will not compile, as the flow would reach d.bark() and c.meaw() even when p1 and p2 are not a Dog & a Cat respectively.

Finally, in the below code, we use the binding variable in the else part of the if condition with the NOT (!) operator. Here, it is assured that p1 and p2 are a Dog and a Cat when the flow reaches their respective else blocks

    public static void main(String[] args) {

        Pet p1 = new Dog("White");
        Pet p2 = new Cat("White");

        if (!(p1 instanceof Dog d)) {
            System.out.println("p1 is not Dog");
        } else {
            d.bark();
        }
        
        if (!(p2 instanceof Cat c)) {
            System.out.println("p2 is not Cat");
        } else {
            c.meaw();
        }
    }

This will compile and execute as expected.
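
The early return form is handy as a guard at the top of a method. Below is a minimal sketch of that style. The class name GuardClause and the helper lengthOrMinusOne() are made up for this illustration.

```java
public class GuardClause {

    // Hypothetical helper: length of o if it is a String, otherwise -1.
    static int lengthOrMinusOne(Object o) {
        if (!(o instanceof String s)) {
            // s is not in scope here. The unconditional return means the
            // flow can never continue past this block when o is not a String.
            return -1;
        }
        // Reaching this line means o is a String, so s is in scope.
        return s.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthOrMinusOne("White")); // 5
        System.out.println(lengthOrMinusOne(42));      // -1
    }
}
```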

Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/02/Language-Features/Instanceof-Binding-Variable/With-NOT-condition

Scoping of binding variables

Binding variables of the instanceof operator are visible only inside the code blocks that are reachable only when the instanceof operator evaluates to true.

Below sample illustrates this.


interface Pet {
    default public String color() {
        return this.color();
    }

}

record Dog(String color) implements Pet {

    public void bark() {
        System.out.println("I bark...");
    }
}

record Cat(String color) implements Pet {

    public void meaw() {
        System.out.println("I meaw...");
    }
}

public class Main {

    public static void main(String[] args) {

        Pet p1 = new Dog("White");
        Pet p2 = new Cat("White");

        if (p1 instanceof Dog d) {
            d.bark();
        }

        if (p2 instanceof Cat c) {
            c.meaw();
        }

        // Neither line below will compile. c & d are not visible beyond their respective if blocks
        d.bark();
        c.meaw();
    }
}

In the above sample, the lines d.bark() and c.meaw() outside of the if statements will not compile, as at this point c & d are not visible.

Now we will make a few changes to the above code to study the scope of binding variables in depth.

First, let's try to move the instanceof operator with the binding variable outside of the if statement. The main method is changed as shown below

    public static void main(String[] args) {

        Pet p1 = new Dog("White");
        Pet p2 = new Cat("White");

        boolean isDog = p1 instanceof Dog d;
        boolean isCat = p2 instanceof Cat c;

        d.bark();
        c.meaw();

    }

In the code above, it looks like c & d should be visible through the rest of the main() method. But this code also fails to compile. Binding variables c & d are not visible beyond the line containing their respective instanceof operator

No code here is reachable only when the instanceof operator evaluates to true, so the scope of the binding variable ends once the result is assigned to the boolean variable. The lines d.bark() and c.meaw() get executed irrespective of whether the instanceof operator evaluates to true or false

This explains why the binding variables are not visible beyond their respective instanceof operator statements. Easy!!! The scenario below gets a bit more tricky

Now let's modify the code to use both the binding variables inside of an if block, combining the two instanceof checks with the non-short-circuiting logical AND (&) operator as shown below

    public static void main(String[] args) {

        Pet p1 = new Dog("White");
        Pet p2 = new Cat("White");

        if (p1 instanceof Dog d & p2 instanceof Cat c) {
            d.bark();
            c.meaw();
        }
    }

Phew... the code still doesn't compile. Binding variables c and d are not visible inside of the if block :(

The logical AND (&) operator is not short-circuiting, and the second condition gets evaluated even when the first check evaluates to false. The scoping rules carry binding variables through the short-circuit operators only, so with & the scope of the binding variable is not extended beyond its own instanceof expression and hence it is not visible inside of the if block!!! Bit tricky, but that's how the scope of the binding variable is...

And now, lets try with the short circuit AND operator (&&) as shown below

    public static void main(String[] args) {

        Pet p1 = new Dog("White");
        Pet p2 = new Cat("White");

        if (p1 instanceof Dog d && p2 instanceof Cat c) {
            d.bark();
            c.meaw();
        }
    }

This time, the code compiles and executes as expected. With the short-circuit operator, every piece of code beyond an instanceof check gets executed if and only if that instanceof operator evaluates to true. This makes the binding variables c and d visible inside of the if block.
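
The same rule also means a binding variable can be used within the very expression that introduces it, even when the result is only stored in a boolean variable. Below is a minimal sketch. It uses a simplified Pet interface and nested records so that it is self-contained, and the class name BindingInBooleanExpression and the helper isWhiteDog() are made up for this illustration.

```java
public class BindingInBooleanExpression {

    // Simplified versions of the types used in this post, nested to keep the sketch in one file.
    interface Pet {
        String color();
    }

    record Dog(String color) implements Pet { }

    record Cat(String color) implements Pet { }

    // Hypothetical helper: true only when p is a Dog whose color is White.
    static boolean isWhiteDog(Pet p) {
        // d is usable to the right of && within this one expression,
        // but it is not in scope in any statement that follows.
        boolean result = p instanceof Dog d && d.color().equals("White");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(isWhiteDog(new Dog("White"))); // true
        System.out.println(isWhiteDog(new Dog("Black"))); // false
        System.out.println(isWhiteDog(new Cat("White"))); // false
    }
}
```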

And there is another interesting case on the scope of binding variable when used with the NOT (!) operator. We will explore that in the next post.


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/02/Language-Features/Instanceof-Binding-Variable/Scoping