Java - Regular Expressions
Welcome, aspiring Java programmers! Today, we're diving into the fascinating world of Regular Expressions (Regex) in Java. Don't worry if you're new to programming; I'll guide you through this journey step by step, just as I've done for countless students over my years of teaching. So, grab a cup of coffee, and let's embark on this exciting adventure together!
What are Regular Expressions?
Before we jump into the Java-specific implementation, let's understand what Regular Expressions are. Imagine you're a detective trying to find a specific pattern in a sea of text. That's exactly what Regex does – it's a powerful tool for pattern matching and manipulation of strings.
Java Regular Expressions (Regex) Classes
Java provides a package called java.util.regex that contains several classes for working with regular expressions. The three main classes we'll focus on are:
- Pattern
- Matcher
- PatternSyntaxException
Let's explore each of these in detail.
Pattern Class
The Pattern class represents a compiled representation of a regular expression. Think of it as the blueprint for your pattern-matching detective work.
import java.util.regex.Pattern;

public class PatternExample {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("Hello");
        System.out.println("Pattern created: " + pattern);
    }
}
In this example, we're creating a simple pattern that matches the literal text "Hello". The static compile() method takes the regular expression as a string and returns a Pattern object.
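Since a compiled Pattern is immutable, a common approach is to compile it once and reuse it. Here is a minimal sketch of that idea; the class name PatternFlagsExample and the helper containsHello are made up for illustration, while Pattern.CASE_INSENSITIVE is a standard flag accepted by the two-argument compile() method.

```java
import java.util.regex.Pattern;

class PatternFlagsExample {
    // Compile once and reuse; Pattern instances are immutable and thread-safe.
    private static final Pattern HELLO =
            Pattern.compile("hello", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the text contains "hello" in any letter case.
    static boolean containsHello(String text) {
        return HELLO.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsHello("HELLO, World!")); // true
        System.out.println(containsHello("Goodbye"));       // false
    }
}
```

Keeping the pattern in a static final field avoids recompiling the same expression on every call, which matters most when the helper runs inside a loop.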
Matcher Class
The Matcher class is where the real magic happens. It performs match operations on a character sequence based on a Pattern.
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MatcherExample {
    public static void main(String[] args) {
        String text = "Hello, World! Hello, Java!";
        Pattern pattern = Pattern.compile("Hello");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            System.out.println("Found match at index: " + matcher.start());
        }
    }
}
This code searches for the pattern "Hello" in the given text and prints the starting index of each match. It's like our detective finding all occurrences of a clue in a document!
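If you'd rather collect the positions than print them, the same find() loop can fill a list. This is only a sketch: the class MatchPositionsExample and the helper findStarts are hypothetical names, not part of java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchPositionsExample {
    // Hypothetical helper: returns the start index of every match of regex in text.
    static List<Integer> findStarts(String regex, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        // Each call to find() moves on to the next match, if there is one.
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(findStarts("Hello", "Hello, World! Hello, Java!")); // [0, 14]
    }
}
```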
Regular Expression Syntax
Now, let's learn some basic syntax for creating more complex patterns. Here's a table of commonly used metacharacters:
Metacharacter | Description |
---|---|
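. | Matches any single character (except line terminators, by default) |
^ | Matches the beginning of the input (or of each line in MULTILINE mode) |
$ | Matches the end of the input (or of each line in MULTILINE mode) |
* | Matches zero or more occurrences of the preceding element |
+ | Matches one or more occurrences of the preceding element |
? | Matches zero or one occurrence of the preceding element |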
\d | Matches a digit |
\s | Matches a whitespace character |
\w | Matches a word character (letter, digit, or underscore) |
Let's see some of these in action:
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexSyntaxExample {
    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";

        // Match words starting with 'q'
        Pattern pattern1 = Pattern.compile("\\bq\\w+");
        Matcher matcher1 = pattern1.matcher(text);
        if (matcher1.find()) {
            System.out.println("Word starting with 'q': " + matcher1.group());
        }

        // Match words ending with 'g'
        Pattern pattern2 = Pattern.compile("\\w+g\\b");
        Matcher matcher2 = pattern2.matcher(text);
        if (matcher2.find()) {
            System.out.println("Word ending with 'g': " + matcher2.group());
        }
    }
}
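In this example, we're using \b to match word boundaries, \w+ to match one or more word characters, and combining them with 'q' and 'g' to find words starting with 'q' and ending with 'g' respectively. Inside a Java string literal each backslash must be doubled, which is why the code writes \\b and \\w+.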
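To see a few more metacharacters from the table working together, here is a small sketch. The class name MetacharacterExample, the two helpers, and the "word, space, number" label format are all invented for illustration.

```java
import java.util.regex.Pattern;

class MetacharacterExample {
    // ^ and $ anchor the pattern to the start and end of the input.
    // \w+ is one or more word characters, \s is one whitespace character,
    // and \d+ is one or more digits.
    private static final Pattern LABEL = Pattern.compile("^\\w+\\s\\d+$");

    // ? makes the preceding 'u' optional, so both spellings match.
    private static final Pattern COLOR = Pattern.compile("colou?r");

    // Hypothetical helper: true for inputs such as "Room 101".
    static boolean isLabel(String text) {
        return LABEL.matcher(text).find();
    }

    // Hypothetical helper: true if the text mentions "color" or "colour".
    static boolean mentionsColor(String text) {
        return COLOR.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(isLabel("Room 101"));            // true
        System.out.println(isLabel("Room 101 annex"));      // false
        System.out.println(mentionsColor("colour chart"));  // true
        System.out.println(mentionsColor("color wheel"));   // true
    }
}
```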
Capturing Groups in Regular Expression
Capturing groups allow you to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside parentheses.
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class CapturingGroupExample {
    public static void main(String[] args) {
        // Sample input; the email address is a placeholder
        String text = "John Doe (john@example.com)";
        Pattern pattern = Pattern.compile("(\\w+)\\s(\\w+)\\s\\((\\w+@\\w+\\.\\w+)\\)");
        Matcher matcher = pattern.matcher(text);
        if (matcher.find()) {
            System.out.println("First Name: " + matcher.group(1));
            System.out.println("Last Name: " + matcher.group(2));
            System.out.println("Email: " + matcher.group(3));
        }
    }
}
In this example, we're extracting a person's first name, last name, and email address from a string. The parentheses in the pattern create capturing groups, which we can access using matcher.group(n).
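Group numbers start at 1, and group(0) always refers to the entire match. The sketch below walks through every group with groupCount(); the class GroupNumberingExample, the helper parsePair, and the key=value input format are assumptions made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingExample {
    // Two capturing groups: a key and a value separated by '='.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(\\w+)");

    // Hypothetical helper: returns {whole match, key, value}, or null if nothing matches.
    static String[] parsePair(String text) {
        Matcher matcher = PAIR.matcher(text);
        if (!matcher.find()) {
            return null;
        }
        // groupCount() reports the number of capturing groups and excludes group 0.
        String[] parts = new String[matcher.groupCount() + 1];
        for (int i = 0; i <= matcher.groupCount(); i++) {
            parts[i] = matcher.group(i); // group(0) is the entire match
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = parsePair("timeout=30");
        System.out.println(String.join(" | ", parts)); // timeout=30 | timeout | 30
    }
}
```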
Regular Expression - Matcher Class Methods
The Matcher class provides several useful methods. Here are some of the most commonly used ones:
Method | Description |
---|---|
find() | Finds the next match for the pattern |
group() | Returns the matched substring |
start() | Returns the starting index of the match |
end() | Returns the index just after the last matched character |
matches() | Checks if the entire string matches the pattern |
Let's see these methods in action:
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MatcherMethodsExample {
    public static void main(String[] args) {
        String text = "The rain in Spain falls mainly on the plain";
        Pattern pattern = Pattern.compile("\\b\\w+ain\\b");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            System.out.println("Found: " + matcher.group());
            System.out.println("Start index: " + matcher.start());
            System.out.println("End index: " + matcher.end());
        }
    }
}
This code finds all words ending with "ain" and prints each match along with its start and end indices.
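The table also lists matches(), which behaves differently from find(): find() succeeds if the pattern occurs anywhere in the text, while matches() succeeds only if the whole text fits the pattern. A minimal sketch of the difference follows, with the hypothetical class MatchesVersusFindExample and two made-up helpers.

```java
import java.util.regex.Pattern;

class MatchesVersusFindExample {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: true if at least one run of digits appears in the text.
    static boolean containsDigits(String text) {
        return DIGITS.matcher(text).find();
    }

    // Hypothetical helper: true only if the entire text consists of digits.
    static boolean isAllDigits(String text) {
        return DIGITS.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsDigits("Order 66")); // true
        System.out.println(isAllDigits("Order 66"));    // false
        System.out.println(isAllDigits("66"));          // true
    }
}
```

For input validation, matches() is usually the safer choice, because find() would accept a valid fragment surrounded by unwanted text.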
Regular Expression - Replacement Methods
Regular expressions aren't just for finding patterns; they're also great for replacing text. The Matcher class provides methods for this purpose:
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class ReplacementExample {
    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        Pattern pattern = Pattern.compile("fox|dog");
        Matcher matcher = pattern.matcher(text);
        String result = matcher.replaceAll("animal");
        System.out.println("After replacement: " + result);
    }
}
In this example, we're replacing both "fox" and "dog" with "animal". The replaceAll() method does all the hard work for us!
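Two related features of Matcher are worth a quick look: replaceFirst() changes only the first match, and a replacement string can refer back to capturing groups with $1, $2, and so on. The sketch below assumes a hypothetical class ReplacementVariantsExample, and the "Last, First" name format is invented for illustration.

```java
import java.util.regex.Pattern;

class ReplacementVariantsExample {
    // Replaces only the first occurrence of "fox" or "dog".
    static String replaceFirstAnimal(String text) {
        return Pattern.compile("fox|dog").matcher(text).replaceFirst("animal");
    }

    // Turns "Last, First" into "First Last" using group references $1 and $2.
    static String swapNames(String text) {
        return Pattern.compile("(\\w+), (\\w+)").matcher(text).replaceAll("$2 $1");
    }

    public static void main(String[] args) {
        System.out.println(replaceFirstAnimal("The quick brown fox jumps over the lazy dog"));
        // The quick brown animal jumps over the lazy dog
        System.out.println(swapNames("Doe, John")); // John Doe
    }
}
```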
Regular Expression - PatternSyntaxException Class
Sometimes, we might make mistakes while writing our regular expressions. That's where the PatternSyntaxException class comes in handy. It's thrown to indicate a syntax error in a regular expression pattern.
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PatternSyntaxExceptionExample {
    public static void main(String[] args) {
        try {
            Pattern.compile("["); // Invalid regex: unclosed character class
        } catch (PatternSyntaxException e) {
            System.out.println("Pattern syntax exception: " + e.getMessage());
            System.out.println("Index of error: " + e.getIndex());
        }
    }
}
This code deliberately uses an invalid regex to demonstrate how PatternSyntaxException works. It's like having a built-in proofreader for your regular expressions!
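One practical use of this exception is checking a pattern that comes from outside your code, such as user input, before relying on it. Here is a rough sketch; the class RegexValidatorExample and the helper isValidRegex are hypothetical, while getDescription() and getPattern() are standard methods of PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexValidatorExample {
    // Hypothetical helper: true if the string compiles as a regular expression.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // getDescription() explains the error; getPattern() returns the offending regex.
            System.out.println(e.getDescription() + " in: " + e.getPattern());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[a-z]+")); // true
        System.out.println(isValidRegex("["));      // false
        System.out.println(isValidRegex("(abc"));   // false
    }
}
```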
And there you have it, folks! We've journeyed through the land of Java Regular Expressions, from basic patterns to complex replacements. Remember, like any powerful tool, regex becomes more useful the more you practice with it. So don't be afraid to experiment and create your own patterns. Who knows? You might just become the Sherlock Holmes of pattern matching!
Happy coding, and may your strings always match your expectations!