Saturday, March 06, 2021

Techniques for including trailing whitespaces into text blocks

By default, the compiler strips trailing whitespace from every line of a text block. Trailing whitespace can still be included in a text block using one of the following approaches:

Character substitution 

Here we put a placeholder character in the text block wherever a trailing space is needed, and replace each placeholder with a space after the compiler has processed the text block. The code for this is shown below.

public class CharacterSubstitution {

    public static void main(String[] args) {

        // Each '#' stands in for one trailing space; replace() below turns the placeholders into spaces.
        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,###
                            But I have promises to keep,###
                            And miles to go before I sleep,###
                            And miles to go before I sleep.###
                        </pre>
                    </body>
                </html>""".replace('#',' ');
    
        System.out.println(poemTextBlock);
    }
}
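
One caveat with the substitution above: replace('#', ' ') turns every '#' into a space, including any that belong to the content. As a minimal sketch, assuming the placeholders only ever appear as a run at the end of a line, a multiline regular expression can restrict the substitution to those runs. The class name, the helper restoreTrailingSpaces and the sample text are made up for illustration and are not part of the original sample code.

```java
import java.util.regex.Pattern;

public class TrailingSubstitutionSketch {

    // One or more '#' characters directly before a line break or the end of the text.
    private static final Pattern TRAILING_MARKERS = Pattern.compile("#+$", Pattern.MULTILINE);

    // Hypothetical helper: replaces each trailing run of '#' with the same number of spaces
    // and leaves every other '#' untouched.
    static String restoreTrailingSpaces(String text) {
        return TRAILING_MARKERS.matcher(text)
                .replaceAll(match -> " ".repeat(match.group().length()));
    }

    public static void main(String[] args) {
        String text = """
                Issue #42 is fixed,###
                see ticket #7.##
                """;

        // Brackets make the trailing spaces visible on the console.
        restoreTrailingSpaces(text).lines()
                .forEach(line -> System.out.println("[" + line + "]"));
    }
}
```

Here the '#' in "#42" and "#7" survives, while the markers at the end of each line become spaces. Whether this is worth the extra code depends on how likely the placeholder is to occur in the real content.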

Character fencing

Here we include the needed trailing spaces in the text block itself. But instead of ending the line with a space, we put a special fence character at the end of the line, so that the spaces are no longer trailing spaces and hence are not stripped away.

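We remove the fence character after the text block is processed, using the replace method as shown in the code below. Matching on "#\n" works on any platform because line terminators inside a text block are normalized to \n.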

public class CharacterFencing {

    public static void main(String[] args) {

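        // The '#' fence keeps the spaces before it from being trailing; replace() below removes the fence.
        String poemTextBlock = """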
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,   #
                            But I have promises to keep,   #
                            And miles to go before I sleep,   #
                            And miles to go before I sleep.   #
                        </pre>
                    </body>
                </html>""".replace("#\n","\n");
    
        System.out.println(poemTextBlock);
    }
}
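
The replace("#\n", "\n") call above relies on every fence being followed by a line break. If the last fenced line shares its line with the closing delimiter, its fence would be left in place. A possible variant, sketched here with a hypothetical helper removeFence, uses a multiline regular expression that removes a '#' at the end of any line, including the last one. It assumes that '#' never legitimately ends a line of the content.

```java
public class FenceRemovalSketch {

    // Hypothetical helper: drops a single '#' that sits at the end of a line.
    // (?m) makes $ match before each line break as well as at the end of the text.
    static String removeFence(String text) {
        return text.replaceAll("(?m)#$", "");
    }

    public static void main(String[] args) {
        // The closing delimiter follows the last fence directly, so no "#\n" exists for that line.
        String text = """
                The woods are lovely, dark and deep,   #
                But I have promises to keep,   #""";

        // Brackets make the preserved trailing spaces visible on the console.
        removeFence(text).lines()
                .forEach(line -> System.out.println("[" + line + "]"));
    }
}
```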

Escape sequence for space

Here we use the octal escape sequence for space, \040, in the text block wherever we need a trailing space. Sample code for this is shown below.

public class EscapeSequence {

    public static void main(String[] args) {

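        // \040 is the octal escape for a space; it is interpreted only after trailing whitespace is stripped.
        String poemTextBlock = """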
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,\040\040\040
                            But I have promises to keep,\040\040\040
                            And miles to go before I sleep,\040\040\040
                            And miles to go before I sleep.\040\040\040
                        </pre>
                    </body>
                </html>""";
    
        System.out.println(poemTextBlock);
    }
}

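Note that the Unicode escape sequence for space (\u0020) cannot be used for this. Unicode escapes are translated before lexical analysis, so they turn into ordinary trailing spaces and get stripped, whereas octal escape sequences are interpreted only after the trailing whitespace has been stripped.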
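
The difference can be checked with a small sketch. The class and method names below are made up for illustration; the two text blocks differ only in the kind of escape used at the end of the line.

```java
public class UnicodeVersusOctal {

    // The three Unicode escapes become plain spaces before the compiler
    // tokenizes the source, so they are stripped as trailing whitespace.
    static String withUnicodeEscapes() {
        return """
                And miles to go before I sleep.\u0020\u0020\u0020
                """;
    }

    // The three octal escapes are interpreted only after stripping,
    // so they end up as three trailing spaces in the string.
    static String withOctalEscapes() {
        return """
                And miles to go before I sleep.\040\040\040
                """;
    }

    public static void main(String[] args) {
        // Each block ends with a line break, so we look for three spaces right before it.
        System.out.println(withUnicodeEscapes().endsWith("   \n")); // prints false
        System.out.println(withOctalEscapes().endsWith("   \n"));   // prints true
    }
}
```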

How exactly the compiler handles Unicode escape sequences inside a text block is a topic to explore in a separate post.

Since escape sequences are interpreted late in the processing, the octal escape sequence for space can also be used as a fence character to keep trailing spaces. Here we do not have to remove the fence character, as it is itself a space that we want to include.

The code below shows this. Here we use two regular space characters followed by an octal escape sequence for space, which adds one more trailing space.

public class EscapeSequence1 {

    public static void main(String[] args) {

        // Two literal spaces are fenced by \040, which itself becomes the third trailing space.
        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,  \040
                            But I have promises to keep,  \040
                            And miles to go before I sleep,  \040
                            And miles to go before I sleep.  \040
                        </pre>
                    </body>
                </html>""";
    
        System.out.println(poemTextBlock);
    }
}

The output here includes three trailing spaces at the end of lines 4 to 7.
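
Trailing spaces are invisible on the console, so one way to confirm this is to count them in code. The following is a minimal sketch; the class TrailingSpaceCount and the helper trailingSpaces are hypothetical names, and the text block is the same as in EscapeSequence1.

```java
public class TrailingSpaceCount {

    // Hypothetical helper: counts the space characters at the end of a line.
    static int trailingSpaces(String line) {
        int end = line.length();
        while (end > 0 && line.charAt(end - 1) == ' ') {
            end--;
        }
        return line.length() - end;
    }

    public static void main(String[] args) {
        String poemTextBlock = """
                <html>
                    <body>
                        <pre>
                            The woods are lovely, dark and deep,  \040
                            But I have promises to keep,  \040
                            And miles to go before I sleep,  \040
                            And miles to go before I sleep.  \040
                        </pre>
                    </body>
                </html>""";

        // Report the number of trailing spaces for every line of the text block.
        String[] lines = poemTextBlock.split("\n");
        for (int i = 0; i < lines.length; i++) {
            System.out.println("line " + (i + 1) + ": " + trailingSpaces(lines[i]) + " trailing space(s)");
        }
    }
}
```

Lines 4 to 7 report 3 trailing spaces and all other lines report 0.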

So far, we have used only the space character in all our examples. But the tab character is also whitespace, and tabs are widely used for code indentation and formatting. How does the tab character behave when used within a text block? We will explore that in our next post.


Sample code used in this post can be downloaded from https://github.com/ashokkumarta/awesomely-java/tree/main/2021/03/Language-Features/Text-blocks/Techniques-for-including-trailing-whitespaces
