Parallel streams are one of the most significant additions in Java 8, alongside lambdas. Much of the Stream API's potential only shows up when a pipeline runs in parallel.
Parallel Streams In Java 8 :
Suppose you have a list of employee objects and need to count the employees whose salary is above 15000. The usual approach is to iterate over the list and check each employee's salary in turn. This takes O(N) time on a single thread, because the elements are visited one after another.
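The loop-based approach described above can be sketched as follows. To keep the sketch self-contained it works on a plain list of salaries rather than Employee objects, and the class name LoopCount and the method countAbove are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class LoopCount {
    // Counts the salaries strictly above the threshold by visiting each element once.
    static long countAbove(List<Integer> salaries, int threshold) {
        long count = 0;
        for (int salary : salaries) {
            if (salary > threshold) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, mirroring one round of the salaries used in this post.
        List<Integer> salaries = Arrays.asList(20000, 3000, 15002, 7856, 200, 50000);
        System.out.println(countAbove(salaries, 15000)); // prints 3
    }
}
```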
Streams give us the flexibility to process the list in parallel, which can produce the aggregate result faster on multi-core hardware.
Streams in Java are sequential by default unless parallel execution is requested explicitly. When a stream executes in parallel, the Java runtime partitions the stream into multiple substreams. Aggregate operations iterate over and process these substreams in parallel and then combine the results.
To create a parallel stream, call parallelStream() on the collection. Calling stream() returns a sequential stream.
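A small sketch can make the difference between the two factory methods visible. It uses isParallel() to ask each stream how it will execute, and also shows parallel(), which switches an existing stream to parallel mode. The class name StreamKinds, the helper modes, and the sample data are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class StreamKinds {
    // Reports the execution mode of three ways of obtaining a stream from a list.
    static boolean[] modes(List<Integer> source) {
        return new boolean[] {
            source.stream().isParallel(),            // sequential by default
            source.parallelStream().isParallel(),    // parallel requested at creation
            source.stream().parallel().isParallel()  // sequential stream switched to parallel
        };
    }

    public static void main(String[] args) {
        List<Integer> salaries = Arrays.asList(20000, 3000, 15002);
        boolean[] result = modes(salaries);
        System.out.println(result[0]); // false
        System.out.println(result[1]); // true
        System.out.println(result[2]); // true
    }
}
```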
Example on Parallel Streams :
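We create a list of 600 employees, 300 of whom have a salary above 15000.
In the run shown below, creating a sequential stream and filtering the elements took 59 milliseconds, whereas the parallel stream took only 4 milliseconds. Treat a single measurement like this as a rough indication only: whichever pipeline runs first also pays the one-time cost of loading the stream and lambda classes.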
import java.util.ArrayList;
import java.util.List;

public class ParallelStream {
    public static void main(String[] args) {
        // Build 600 employees; A, C and F (3 of every 6) earn more than 15000.
        List<Employee> empList = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            empList.add(new Employee("A", 20000));
            empList.add(new Employee("B", 3000));
            empList.add(new Employee("C", 15002));
            empList.add(new Employee("D", 7856));
            empList.add(new Employee("E", 200));
            empList.add(new Employee("F", 50000));
        }

        // Time the sequential pipeline.
        long t1 = System.currentTimeMillis();
        System.out.println("Sequential Stream count: " + empList.stream().filter(e -> e.getSalary() > 15000).count());
        long t2 = System.currentTimeMillis();
        System.out.println("Sequential Stream Time taken:" + (t2 - t1));

        // Time the same pipeline on a parallel stream.
        t1 = System.currentTimeMillis();
        System.out.println("Parallel Stream count: " + empList.parallelStream().filter(e -> e.getSalary() > 15000).count());
        t2 = System.currentTimeMillis();
        System.out.println("Parallel Stream Time taken:" + (t2 - t1));
    }
}
Employee.java
class Employee {
    private int salary;
    private String name;

    Employee(String name, int salary) {
        this.name = name;
        this.salary = salary;
    }

    public int getSalary() {
        return salary;
    }

    public void setSalary(int salary) {
        this.salary = salary;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
Output:
Sequential Stream count: 300
Sequential Stream Time taken:59
Parallel Stream count: 300
Parallel Stream Time taken:4
Performance Implications:
Parallel streams have costs as well as advantages. The work on each substream runs on its own thread, so splitting the source, scheduling the threads, and combining the partial results all add overhead compared to a sequential stream. Coordination between threads takes time, and sharing mutable state between them is error-prone.
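The following sketch illustrates both points under some assumptions. The first method records which threads actually process the elements, using a thread-safe set because several threads write to it. The second lets collect() combine the per-thread partial results, so no shared mutable list is needed. The class and method names are hypothetical, and the thread names printed will differ between machines and runs.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelThreads {
    // Returns the names of the threads that processed the elements 0..n-1.
    static Set<String> workerThreadNames(int n) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // safe for concurrent writes
        IntStream.range(0, n)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    // Collects the even numbers below n; collect() merges the partial results itself,
    // so the lambda never touches shared mutable state.
    static List<Integer> evenNumbers(int n) {
        return IntStream.range(0, n)
                        .parallel()
                        .filter(i -> i % 2 == 0)
                        .boxed()
                        .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(workerThreadNames(1000)); // thread names vary per run
        System.out.println(evenNumbers(10));         // [0, 2, 4, 6, 8]
    }
}
```

Adding to a plain ArrayList from inside forEach on a parallel stream is the kind of shared-state access this sketch is meant to avoid, since ArrayList is not thread-safe.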
When to use Parallel Streams:
- When the result of the operation does not depend on the order of the elements in the source collection (the one the stream is created from).
- For aggregate functions such as counting or summing.
- When iterating over large collections.
- When a sequential stream has a measurable performance problem.
- When your environment is not already multi-threaded. By default, parallel streams run on threads from the shared common fork/join pool, so in an application that is already busy they can affect other work, such as new incoming requests.
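The first two points in the list above can be sketched as follows, again using a plain list of salaries and hypothetical names. A sum is an aggregate whose value does not depend on the order in which elements are processed, so it is a natural fit for a parallel stream. When the order does matter, forEachOrdered() preserves the encounter order even on a parallel stream, at the cost of some of the parallel speed-up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class OrderingDemo {
    // Aggregate operation: the total is the same whatever order the elements are visited in.
    static int total(List<Integer> salaries) {
        return salaries.parallelStream().mapToInt(Integer::intValue).sum();
    }

    // forEachOrdered() hands elements to the action in encounter order,
    // even though the stream itself is parallel.
    static List<Integer> inEncounterOrder(List<Integer> salaries) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        salaries.parallelStream().forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> salaries = Arrays.asList(20000, 3000, 15002, 7856, 200, 50000);
        System.out.println(total(salaries));            // 96058
        System.out.println(inEncounterOrder(salaries)); // [20000, 3000, 15002, 7856, 200, 50000]
    }
}
```

With plain forEach() in place of forEachOrdered(), the order in which a parallel stream visits the elements is not guaranteed.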
Happy Learning 🙂