Java - Regular Expressions

Welcome, aspiring Java programmers! Today, we're diving into the fascinating world of Regular Expressions (Regex) in Java. Don't worry if you're new to programming; I'll guide you through this journey step by step, just as I've done for countless students over my years of teaching. So, grab a cup of coffee, and let's embark on this exciting adventure together!

What are Regular Expressions?

Before we jump into the Java-specific implementation, let's understand what Regular Expressions are. Imagine you're a detective trying to find a specific pattern in a sea of text. That's exactly what Regex does – it's a powerful tool for pattern matching and manipulation of strings.

Java Regular Expressions (Regex) Classes

Java provides a package called java.util.regex that contains several classes for working with regular expressions. The three main classes we'll focus on are:

  1. Pattern
  2. Matcher
  3. PatternSyntaxException

Let's explore each of these in detail.

Pattern Class

The Pattern class represents a compiled representation of a regular expression. Think of it as the blueprint for your pattern-matching detective work.

import java.util.regex.Pattern;

public class PatternExample {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("Hello");
        System.out.println("Pattern created: " + pattern);
    }
}

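In this example, we're creating a simple pattern that matches the literal text "Hello". The compile() method takes the regular expression as a string and returns a Pattern object. Printing a Pattern shows the expression it was compiled from, so the output is "Pattern created: Hello".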
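Because a compiled Pattern is immutable, it can be stored once and reused for many inputs. The following is a minimal sketch of that idea, not part of the original lesson: the class name, the GREETING constant, and the isGreeting() helper are hypothetical names chosen for illustration, and the CASE_INSENSITIVE flag is used only to show that compile() accepts optional flags.

```java
import java.util.regex.Pattern;

class PatternBasicsSketch {
    // Compiled once and reused; Pattern objects are immutable and safe to share.
    static final Pattern GREETING = Pattern.compile("hello", Pattern.CASE_INSENSITIVE);

    // True when the whole input is the greeting, in any letter case.
    static boolean isGreeting(String input) {
        return GREETING.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(GREETING.pattern());    // prints "hello"
        System.out.println(isGreeting("HELLO"));   // prints "true"
        System.out.println(isGreeting("Hello!"));  // prints "false" (the '!' is not part of the pattern)

        // One-off check of a whole string without keeping a Pattern object around.
        System.out.println(Pattern.matches("Hello", "Hello")); // prints "true"
    }
}
```

For a pattern used only once, the static Pattern.matches() call is convenient; for anything checked repeatedly, keeping the compiled Pattern avoids recompiling the expression each time.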

Matcher Class

The Matcher class is where the real magic happens. It performs match operations on a character sequence based on a Pattern.

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MatcherExample {
    public static void main(String[] args) {
        String text = "Hello, World! Hello, Java!";
        Pattern pattern = Pattern.compile("Hello");
        Matcher matcher = pattern.matcher(text);

        while (matcher.find()) {
            System.out.println("Found match at index: " + matcher.start());
        }
    }
}

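This code searches for the pattern "Hello" in the given text and prints the starting index of each match, which is 0 and 14 here. Each call to find() resumes the search where the previous match ended and returns false once no matches remain. It's like our detective finding all occurrences of a clue in a document!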
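The same find() loop can collect the positions instead of printing them. Here is a small sketch under that assumption; matchStarts() is a hypothetical helper name, not something from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchPositionsSketch {
    // Returns the start index of every non-overlapping match of regex in text.
    static List<Integer> matchStarts(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // prints [0, 14]
        System.out.println(matchStarts("Hello", "Hello, World! Hello, Java!"));
        // prints [] because find() returns false on the first call
        System.out.println(matchStarts("Goodbye", "Hello, World!"));
    }
}
```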

Regular Expression Syntax

Now, let's learn some basic syntax for creating more complex patterns. Here's a table of commonly used metacharacters:

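Metacharacter  Description
.              Matches any single character (line terminators excluded by default)
^              Matches the beginning of the input (of each line with the Pattern.MULTILINE flag)
$              Matches the end of the input (of each line with the Pattern.MULTILINE flag)
*              Matches zero or more occurrences of the preceding element
+              Matches one or more occurrences of the preceding element
?              Matches zero or one occurrence of the preceding element
\d             Matches a digit
\s             Matches a whitespace character
\w             Matches a word character (letter, digit, or underscore)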

Let's see some of these in action:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSyntaxExample {
    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";

        // Match words starting with 'q'
        Pattern pattern1 = Pattern.compile("\\bq\\w+");
        Matcher matcher1 = pattern1.matcher(text);
        if (matcher1.find()) {
            System.out.println("Word starting with 'q': " + matcher1.group());
        }

        // Match words ending with 'g'
        Pattern pattern2 = Pattern.compile("\\w+g\\b");
        Matcher matcher2 = pattern2.matcher(text);
        if (matcher2.find()) {
            System.out.println("Word ending with 'g': " + matcher2.group());
        }
    }
}

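In this example, \b matches a word boundary and \w+ matches one or more word characters. Combined with 'q' and 'g', they find a word starting with 'q' ("quick") and a word ending with 'g' ("dog"). The backslashes are doubled in the code because \ is also the escape character in Java string literals, so the regex \b has to be written as "\\b".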
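To get a feel for the quantifiers and character classes in the table, it can help to test them against tiny inputs. The sketch below does that with Pattern.matches(), which checks the whole input; the fullMatch() wrapper and the sample strings are illustrative choices, not taken from the lesson.

```java
import java.util.regex.Pattern;

class MetacharacterSketch {
    // True when the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("c.t", "cat"));       // true: '.' stands for any one character
        System.out.println(fullMatch("ab*c", "ac"));       // true: '*' allows zero occurrences of 'b'
        System.out.println(fullMatch("ab+c", "ac"));       // false: '+' needs at least one 'b'
        System.out.println(fullMatch("colou?r", "color")); // true: '?' makes the 'u' optional
        // true: one or more digits, one whitespace character, one or more word characters
        System.out.println(fullMatch("\\d+\\s\\w+", "42 apples"));
    }
}
```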

Capturing Groups in Regular Expression

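Capturing groups let you treat part of a pattern as a single unit and retrieve the text that part matched. You create one by wrapping that part of the pattern in parentheses. Groups are numbered from 1 in the order of their opening parentheses, and group 0 always refers to the entire match.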

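import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CapturingGroupExample {
    public static void main(String[] args) {
        // Placeholder address used for illustration
        String text = "John Doe (john@example.com)";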
        Pattern pattern = Pattern.compile("(\\w+)\\s(\\w+)\\s\\((\\w+@\\w+\\.\\w+)\\)");
        Matcher matcher = pattern.matcher(text);

        if (matcher.find()) {
            System.out.println("First Name: " + matcher.group(1));
            System.out.println("Last Name: " + matcher.group(2));
            System.out.println("Email: " + matcher.group(3));
        }
    }
}

In this example, we're extracting a person's first name, last name, and email address from a string. The parentheses in the pattern create capturing groups, which we can access using matcher.group(n).
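Group numbering is easier to see with a smaller pattern. The following sketch pulls the parts out of a date written as year-month-day; the ISO_DATE constant, the firstDate() helper, and the sample sentence are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingSketch {
    // Three capturing groups, numbered 1 to 3 from left to right: year, month, day.
    static final Pattern ISO_DATE = Pattern.compile("(\\d+)-(\\d+)-(\\d+)");

    // Returns {year, month, day} from the first date in text, or null when none is found.
    static String[] firstDate(String text) {
        Matcher m = ISO_DATE.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        Matcher m = ISO_DATE.matcher("Released on 2024-03-15.");
        if (m.find()) {
            System.out.println(m.groupCount()); // prints 3 (group 0 is not counted)
            System.out.println(m.group(0));     // prints 2024-03-15 (same as m.group())
            System.out.println(m.group(1));     // prints 2024
        }
    }
}
```

Calling group(n) before a successful find() or matches() throws an IllegalStateException, which is why the helper checks the result of find() first.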

Regular Expression - Matcher Class Methods

The Matcher class provides several useful methods. Here are some of the most commonly used ones:

Method     Description
find()     Finds the next part of the input that matches the pattern
group()    Returns the text of the most recent match
start()    Returns the index of the first character of the match
end()      Returns the index just past the last character of the match
matches()  Checks whether the entire input matches the pattern

Let's see these methods in action:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherMethodsExample {
    public static void main(String[] args) {
        String text = "The rain in Spain falls mainly on the plain";
        Pattern pattern = Pattern.compile("\\b\\w+ain\\b");
        Matcher matcher = pattern.matcher(text);

        while (matcher.find()) {
            System.out.println("Found: " + matcher.group());
            System.out.println("Start index: " + matcher.start());
            System.out.println("End index: " + matcher.end());
        }
    }
}

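This code finds all words ending with "ain" (rain, Spain, and plain) and prints each match along with its start and end indices. The word "mainly" is skipped because \b requires the word to end right after "ain". Note that end() is exclusive, so text.substring(matcher.start(), matcher.end()) gives the same string as matcher.group().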
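The matches() method from the table behaves differently from find(), and the difference is a common source of confusion. Here is a short sketch comparing them; it also uses lookingAt(), a Matcher method not listed in the table that matches only at the start of the input. The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

class MatchModesSketch {
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // The entire input must consist of digits.
    static boolean wholeInputMatches(String s) {
        return DIGITS.matcher(s).matches();
    }

    // The input must start with digits; anything may follow.
    static boolean prefixMatches(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    // Digits may appear anywhere in the input.
    static boolean containsMatch(String s) {
        return DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("123abc")); // prints false
        System.out.println(prefixMatches("123abc"));     // prints true
        System.out.println(containsMatch("abc123"));     // prints true
        System.out.println(prefixMatches("abc123"));     // prints false
    }
}
```

A reasonable rule of thumb: use matches() to validate a whole value, such as a form field, and find() to search inside longer text.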

Regular Expression - Replacement Methods

Regular expressions aren't just for finding patterns; they're also great for replacing text. The Matcher class provides replaceAll() and replaceFirst() for this purpose:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplacementExample {
    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        Pattern pattern = Pattern.compile("fox|dog");
        Matcher matcher = pattern.matcher(text);

        String result = matcher.replaceAll("animal");
        System.out.println("After replacement: " + result);
    }
}

In this example, we're replacing both "fox" and "dog" with "animal". The replaceAll() method does all the hard work for us!
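Replacement strings can also refer back to capturing groups with $1, $2, and so on. The sketch below shows that, along with replaceFirst() and Matcher.quoteReplacement() for replacement text that contains a literal dollar sign. The helper names and sample strings are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacementSketch {
    // Turns "Last, First" into "First Last" by referring to the groups as $2 and $1.
    static String swapNames(String text) {
        return Pattern.compile("(\\w+), (\\w+)").matcher(text).replaceAll("$2 $1");
    }

    // Replaces only the first occurrence of "fox" or "dog".
    static String replaceFirstAnimal(String text) {
        return Pattern.compile("fox|dog").matcher(text).replaceFirst("animal");
    }

    // Inserts the price literally; quoteReplacement() escapes '$' so that
    // a value such as "$5" is not read as a reference to group 5.
    static String fillPrice(String text, String price) {
        return Pattern.compile("TBD").matcher(text).replaceAll(Matcher.quoteReplacement(price));
    }

    public static void main(String[] args) {
        System.out.println(swapNames("Doe, John"));        // prints John Doe
        System.out.println(replaceFirstAnimal("The quick brown fox jumps over the lazy dog"));
        // prints The quick brown animal jumps over the lazy dog
        System.out.println(fillPrice("price: TBD", "$5")); // prints price: $5
    }
}
```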

Regular Expression - PatternSyntaxException Class

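Sometimes, we might make mistakes while writing our regular expressions. That's where the PatternSyntaxException class comes in handy. Pattern.compile() throws it when the regular expression contains a syntax error. It is an unchecked exception, so the compiler won't force you to catch it, but catching it is worthwhile whenever the pattern comes from user input.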

import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PatternSyntaxExceptionExample {
    public static void main(String[] args) {
        try {
            Pattern.compile("[");  // Invalid regex
        } catch (PatternSyntaxException e) {
            System.out.println("Pattern syntax exception: " + e.getMessage());
            System.out.println("Index of error: " + e.getIndex());
        }
    }
}

This code deliberately uses an invalid regex to demonstrate how PatternSyntaxException works. It's like having a built-in proofreader for your regular expressions!
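One practical use is checking a pattern that comes from outside the program before relying on it. Here is a minimal sketch of such a check; isValidRegex() is a hypothetical helper name, and the exact wording returned by getDescription() depends on the Java version, so no output is shown for it.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexValidationSketch {
    // Returns true when the regex compiles, false when it has a syntax error.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("\\d+"));       // prints true
        System.out.println(isValidRegex("(unclosed"));  // prints false

        try {
            Pattern.compile("(unclosed");
        } catch (PatternSyntaxException e) {
            System.out.println("Description: " + e.getDescription()); // short explanation of the error
            System.out.println("Pattern: " + e.getPattern());         // the offending regex
            System.out.println("Index: " + e.getIndex());             // approximate error position, or -1
        }
    }
}
```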

And there you have it, folks! We've journeyed through the land of Java Regular Expressions, from basic patterns to complex replacements. Remember, like any powerful tool, regex becomes more useful the more you practice with it. So don't be afraid to experiment and create your own patterns. Who knows? You might just become the Sherlock Holmes of pattern matching!

Happy coding, and may your strings always match your expectations!
