Author: José C. Riquelme Santos

Full Professor. Department of Computer Languages and Systems (Departamento de Lenguajes y Sistemas Informáticos). Universidad de Sevilla
ORCID iD: orcid.org/0000-0002-8243-2186
JAVA 8: Methods of stream


1. filter(Predicate)
2. allMatch(Predicate)
3. anyMatch(Predicate)
4. min(Comparator)
5. max(Comparator)
6. map(Function)
7. reduce(T, BiFunction)
8. forEach(Consumer)
9. sorted(Comparator)
10. collect(Collector)
    10.1 Reduction to List or Set
    10.2 Statistics
    10.3 Reduction to Map
        10.3.1 groupingBy(Function) and partitioningBy(Predicate)
        10.3.2 groupingBy(Function, Collector)
        10.3.3 groupingBy(Function, Supplier, Collector)



Introduction

The release of Java 8 was a very deep change with respect to previous versions, and it forced us to reconsider the way we teach programming to our students. These manuals aim to sum up the main properties we found in a quick study of the new functionality of the type Stream. All the code presented here was written using Eclipse Luna (4.4), build id I20140501-0200.

To give examples of the use cases we will use the type Flight, with the following interface:

public interface Flight  extends Comparable<Flight>{
       String getCode() ;
       String getDestination() ;
       LocalDate getDate();
       Integer getNumPassengers();
       Integer getNumSeats() ;
       Duration getDuration();
       Double getPrice() ;
       Double getOccupation(); //optional
       Double getEarnings (); //optional
       void setPrice(Double p);
       void setDuration(Duration d);
       void setNumPassengers(Integer np) ;
}

The type Duration implements the time of a Flight and has the following functionality:

public interface Duration extends Comparable<Duration>{
       Integer getMinutes();
       Integer getHours();
       Duration sum(Duration d);
}

Our objective is to implement a type Airport that holds, as an attribute, a collection of objects of type Flight called flights, and that supports several operations. These operations are introduced in the sections into which this text is divided, for clearer comprehension. The difficulty of the code increases gradually. Types such as Predicate, Function and Supplier are not explained in this manual but in a separate, second one.

In this document we focus on the type Stream, which can be seen as a virtual iterable over objects. The method stream() applied to a Collection returns a Stream over the objects of that Collection. That is why the code fragments in the following sections always start with flights.stream(). The main operations of Stream are described in the following sections.


1. filter(Predicate) 

It returns another Stream containing only the elements of the invoking one that satisfy the given Predicate. It is usually combined with methods such as count, to find out how many elements of a Collection satisfy a property. For instance, how many flights there are on a given date:

public Long getNumFlightsDay(LocalDate f) {
   return flights.stream().
       filter(x->x.getDate().equals(f)).
       count();
}

Or how many full flights there are:

public Long getNumFullFlights() {
   return flights.stream().
       filter(x->x.getNumPassengers().equals(x.getNumSeats())).
       count();
}

If a Predicate is going to be used several times, it is convenient to define it once and then reuse it. For example, the condition of whether a flight takes place on a certain date could be defined as:

Predicate<Flight> equalDate(LocalDate f){
    return x -> x.getDate().equals(f);
}

And would later be invoked:

public Long getNumFlightsDate(LocalDate f) {
   return flights.stream().
       filter(equalDate(f)).
       count();
}

It can also be combined with methods like findFirst, which returns an Optional with the first object of a stream (empty if the stream has no elements), or limit(long), which returns a stream truncated to at most the number of elements given as parameter.
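
As an illustrative sketch of these two combinations, the following self-contained example uses plain strings instead of Flight objects. The class name and the sample data are assumptions made only for the demonstration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class FilterCombinationsDemo {

    // Hypothetical sample data: destination names stand in for Flight objects.
    static final List<String> DESTINATIONS =
            Arrays.asList("Madrid", "Paris", "Malaga", "Rome", "Munich");

    // First destination starting with the given prefix, if any.
    static Optional<String> firstStartingWith(String prefix) {
        return DESTINATIONS.stream()
                .filter(x -> x.startsWith(prefix))
                .findFirst();
    }

    // At most n destinations starting with the given prefix, in encounter order.
    static List<String> firstNStartingWith(String prefix, long n) {
        return DESTINATIONS.stream()
                .filter(x -> x.startsWith(prefix))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstStartingWith("M"));      // Optional[Madrid]
        System.out.println(firstStartingWith("Z"));      // Optional.empty
        System.out.println(firstNStartingWith("M", 2));  // [Madrid, Malaga]
    }
}
```

Because findFirst may find nothing, its result has to be unwrapped with the methods of Optional before it can be used as a value.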


2. allMatch(Predicate)

It returns true if all the elements of a stream satisfy the given condition. For instance, if we want to know if all the flights of a date are full:

public Boolean allFull(LocalDate f){
   return flights.stream().
       filter(equalDate(f)).
       allMatch(x->x.getNumPassengers().
                equals(x.getNumSeats()));
}

In this case the Predicate of filter could instead be combined with the one of allMatch using the “and” method of Predicate. As with equalDate, the condition of being full is defined as a method that returns the Predicate:

Predicate<Flight> flightFull(){
    return x -> x.getNumPassengers().equals(x.getNumSeats());
}
 
public Boolean allFull2(LocalDate f){
    return flights.stream().
             allMatch(equalDate(f).and(flightFull()));
}     

We must take into account that allFull2 does not behave the same way as allFull. allFull performs the filter first, so only the flights on the given date are checked, and if the filtered stream is empty, allMatch returns true. allFull2 applies the combined predicate to every flight of the airport, so it returns false as soon as any flight is on another date or is not full. In particular, if there are flights but none of them is on the given date, the result is false.
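
The difference between the two methods can be checked with a small self-contained sketch. Flights are reduced here to a hypothetical class MiniFlight with just a date and a "full" flag; these names are assumptions for the demonstration and are not part of the Flight type. The third method, allFull3, shows one possible way of keeping the behaviour of allFull with a single predicate.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class AllMatchDemo {

    // Hypothetical stand-in for Flight: only a date and a "full" flag.
    static class MiniFlight {
        final LocalDate date;
        final boolean full;
        MiniFlight(LocalDate date, boolean full) { this.date = date; this.full = full; }
    }

    static Predicate<MiniFlight> equalDate(LocalDate f) { return x -> x.date.equals(f); }
    static Predicate<MiniFlight> flightFull() { return x -> x.full; }

    // filter + allMatch: only the flights on date f are examined.
    static boolean allFull(List<MiniFlight> flights, LocalDate f) {
        return flights.stream().filter(equalDate(f)).allMatch(flightFull());
    }

    // and(): every flight must be on date f AND full.
    static boolean allFull2(List<MiniFlight> flights, LocalDate f) {
        return flights.stream().allMatch(equalDate(f).and(flightFull()));
    }

    // Implication "on date f => full": gives the same result as allFull.
    static boolean allFull3(List<MiniFlight> flights, LocalDate f) {
        return flights.stream().allMatch(equalDate(f).negate().or(flightFull()));
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2014, 6, 1);
        LocalDate d2 = LocalDate.of(2014, 6, 2);
        List<MiniFlight> flights = Arrays.asList(
                new MiniFlight(d1, true), new MiniFlight(d2, false));
        System.out.println(allFull(flights, d1));   // true: the only flight on d1 is full
        System.out.println(allFull2(flights, d1));  // false: there is a flight on another date
        System.out.println(allFull3(flights, d1));  // true
    }
}
```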


3. anyMatch(Predicate)

It returns true if at least one of the elements of the stream satisfies the given Predicate. For instance, if we want to know if there is any flight to a specific destination on a specific date:

public Boolean existsFlightDestinationDate(LocalDate f, String d){
   return flights.stream().
       anyMatch(x->x.getDate().equals(f) &&
             x.getDestination().equals(d));
}

Or to check if there is some non-full flight on a certain date, assuming we already have the predicates equalDate and flightFull:

public Boolean existsFlightDateNotFull(LocalDate f){
       return flights.stream().
             anyMatch(equalDate(f).
             and(flightFull().negate()));
}

 

4. min(Comparator)

It returns the minimal element of a stream according to the order established by a Comparator. To define a Comparator of a class by any of its attributes it is very useful to use the static method comparing of Comparator. For instance, the cheapest flight to a certain destination is obtained by comparing objects of type Flight by their price using the Function Flight::getPrice:

public Flight getCheaperFlightDestination(String d){
       return flights.stream().
             filter(x->x.getDestination().equals(d)).
             min(Comparator.comparing(Flight::getPrice)).
             get();
}

The Comparator interface also offers versions of comparing specialised for primitive types. In the previous example the method comparingDouble could have been used, since getPrice returns a Double. The methods comparingInt and comparingLong are available as well. We could also use the natural order of Flight (given by its compareTo) with the call:

    min(Comparator.naturalOrder());
 

We must take into account that the min method returns what in Java 8 is called an Optional, because the result might not exist: consider the case of calculating the minimal element of an empty stream, which could happen if there were no flights for the given destination. That is why the get method is invoked in the last step. It returns the element if it exists and throws NoSuchElementException otherwise. This possibility can be handled with other methods of Optional, for instance isPresent, which returns true if there is indeed a value in the Optional; ifPresent, which runs a Consumer only when the value exists; or orElse, which returns a given default object if the Optional is empty (which would mean no minimal element was found).
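
A minimal sketch of these ways of handling the Optional returned by min, using a list of prices instead of Flight objects. The class name, the method cheapest and the default value -1.0 are assumptions chosen only for the demonstration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class OptionalMinDemo {

    // Minimum of a list of prices; an empty Optional when the list is empty.
    static Optional<Double> cheapest(List<Double> prices) {
        return prices.stream().min(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        Optional<Double> some = cheapest(Arrays.asList(120.0, 89.5, 240.0));
        Optional<Double> none = cheapest(Collections.<Double>emptyList());

        System.out.println(some.isPresent());   // true
        System.out.println(none.isPresent());   // false

        // orElse supplies a default instead of throwing NoSuchElementException.
        System.out.println(some.orElse(-1.0));  // 89.5
        System.out.println(none.orElse(-1.0));  // -1.0

        // ifPresent runs the action only when a value exists.
        none.ifPresent(p -> System.out.println("never printed: " + p));
    }
}
```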

 
5. max(Comparator)

Similar to the previous method but with the maximum instead of the minimum. For instance, if we wanted to return the flight with the highest occupation on a given day, we would implement the method:

public Flight getFlightHigherOccupation(LocalDate f) {
    return flights.stream().
       filter(x->x.getDate().equals(f)).
       max(Comparator.comparingDouble(
             x->1.*x.getNumPassengers()/x.getNumSeats())).
       get();
}


In the previous case the ToDoubleFunction was written inline as the argument of the method comparingDouble. The factor 1.* forces a floating-point division, since both operands are of type Integer. A different solution would be possible if Flight had a method to return its occupation. In that case the code would be:


public Flight getFlightHigherOccupation(LocalDate f) {
    return flights.stream().
       filter(x->x.getDate().equals(f)).
       max(Comparator.comparingDouble(Flight::getOccupation)).
       get();
}

Of course, if the type Flight did not have that functionality we could define a Function for it:

Function<Flight, Double> getOccupation = x->
                                  1.*x.getNumPassengers()/x.getNumSeats();


Or a ToDoubleFunction:

ToDoubleFunction<Flight> getOccupation = x->
                                  1.*x.getNumPassengers()/x.getNumSeats();


Our method then becomes:

public Flight getFlightHigherOccupation(LocalDate f) {
  return flights.stream().
       filter(x->x.getDate().equals(f)).
       max(Comparator.comparingDouble(getOccupation)).
       get();
} // getOccupation refers to the one defined as a ToDoubleFunction;
// to use getOccupation as a Function we should invoke it from comparing


 
6. map(Function)

This is actually a family of methods. map(Function) returns a Stream of objects of another type, obtained by applying the Function to each element of the original stream. When the result type is numeric, the adaptations mapToInt(ToIntFunction), mapToLong(ToLongFunction) and mapToDouble(ToDoubleFunction) return an IntStream, a LongStream or a DoubleStream, which offer additional numerical operations such as max, sum and average. This way, if in the previous method we are not interested in the flight with the highest occupation but directly in its occupation, we could implement the method as:

public Double getHigherOccupation(LocalDate f) {
    return flights.stream().
       filter(x->x.getDate().equals(f)).
       mapToDouble(x->1.*x.getNumPassengers()/x.getNumSeats()).
       max().
       getAsDouble();
}

Or using the previous ToDoubleFunction:

public Double getHigherOccupation(LocalDate f) {
    return flights.stream().
       filter(x->x.getDate().equals(f)).
       mapToDouble(getOccupation).
       max().getAsDouble();
}

The map methods also make a series of operations over the mapped values easier. For example, if we wanted to know the total billing of the flights to a certain destination, we would write:

public Double sumBillingDestination(String d) {
   return flights.stream().
       filter(x->x.getDestination().equals(d)).
       mapToDouble(x->x.getNumPassengers()*x.getPrice()).
       sum();
}

Instead of obtaining the sum, we could find the average value with the method called average. For instance, defining a ToDoubleFunction:
 

ToDoubleFunction<Flight> getBilling = x->
                                        x.getNumPassengers()*x.getPrice();
 
The average billing of a certain day would be:


public Double averageBillingDate(LocalDate f) {
    return flights.stream().
             filter(equalDate(f)).
             mapToDouble(getBilling).
             average().
             getAsDouble();
}// average returns an OptionalDouble just in case there are no elements

 

7. reduce(T, BiFunction)

The reduce method produces a “reduction” of the stream according to the operation described in the BiFunction, using the first argument as the neutral element. The sum and average operations seen previously are particular cases of reductions over numerical types. For other types created by the programmer, like Duration, the reduce method can be used to add the durations of the flights of a certain day.

 
We assume the Duration type has a sum method capable of returning the addition of two durations.
 

public Duration sum(Duration d) {
       Integer min = getMinutes() + d.getMinutes();
       Integer hor = getHours() + d.getHours();
       return new DurationImpl(hor+min/60,min%60);
}

 
The reduce method is invoked with two arguments: the first one is the neutral element that initializes the accumulator and the second one is the operation that accumulates:


public Duration getDurationFlightsDate(LocalDate f){
    return flights.stream().
       filter(x->x.getDate().equals(f)).
       map(Flight::getDuration).
       reduce(new DurationImpl(0,0), Duration::sum);        
}


Another way of doing this is to define the BiFunction in the code. In this case, since the two arguments and the result are all of type Duration, this is the particular case of a BinaryOperator (which is in fact the type that the two-argument reduce declares for its second parameter), and it could be defined as:

 
       BinaryOperator<Duration> sumdur = (x,y)-> x.sum(y);

 
Even if the Duration type did not implement the sum functionality, it could be done in the code:


BinaryOperator<Duration> sumdur = (x,y)-> {
       Integer min = x.getMinutes() + y.getMinutes();
       Integer hor = x.getHours() + y.getHours();
       return new DurationImpl(hor+min/60,min%60);
};

 
This way, the call to the reduce method would be:
 

       reduce(new DurationImpl(0,0), sumdur);
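
To illustrate the remark at the start of this section that sum is a particular case of reduce, here is a minimal sketch on plain integers. The class name and the sample values are assumptions for the demonstration; Integer::sum plays the role that Duration::sum plays above.

```java
import java.util.Arrays;
import java.util.List;

class ReduceSumDemo {

    // Sum written as an explicit reduction: identity 0 and accumulator Integer::sum.
    static int sumWithReduce(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // The same result through the specialised IntStream.sum().
    static int sumWithMapToInt(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> minutes = Arrays.asList(90, 45, 130);
        System.out.println(sumWithReduce(minutes));    // 265
        System.out.println(sumWithMapToInt(minutes));  // 265
        // On an empty stream reduce returns the identity.
        System.out.println(sumWithReduce(Arrays.<Integer>asList()));  // 0
    }
}
```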

 

8. forEach(Consumer)

The forEach method performs the action defined by the Consumer passed as argument on every element of the stream. For instance, to increase by a given number of minutes the duration of the flights to a certain destination, a method like the following one would be used:

public void incrementDurationDestination(String d, Integer min){
       flights.stream().
       filter(x->x.getDestination().equals(d)).
       forEach(x->x.setDuration(x.getDuration().
       sum(new DurationImpl(0,min))));
}

Or to increase by 10% the prices of the flights after a certain day:

Consumer<Flight> incrementPrice10p =  x->x.setPrice(x.getPrice()*1.1);
      
public void incrementPrices10pfromDate(LocalDate f) {
       flights.stream().
       filter(x->x.getDate().compareTo(f)>0).
       forEach(incrementPrice10p);
}

It could also be used to implement a static method that would write on a text file the elements of a stream, one for each line:

public static <T> void writeFile(Stream<T> it, String filename){
   File file = new File(filename);
   // try-with-resources closes the writer even if forEach throws
   try (PrintWriter ps = new PrintWriter(file)) {
       it.forEach(x->ps.println(x));
   } catch (FileNotFoundException e) {
       System.out.println("File not found "+filename);
   }
}

 

9. sorted(Comparator)

The sorted method returns the stream ordered according to the Comparator given as a parameter. If no Comparator is given, the natural order is used. For instance, reusing the previous static method for writing a stream to a file, we could implement the following method, which writes the flights ordered by date and, for equal dates, by duration:

public void writeFlightsOrderedDateDuration (String fileName){
       Util.writeFile(flights.stream().
             sorted(Comparator.comparing(Flight::getDate).
             thenComparing(Flight::getDuration)), fileName);
}
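
As a small self-contained sketch of both forms of sorted, the following example orders a list of strings, first by natural order and then with a composed Comparator. The class name and the sample flight codes are assumptions for the demonstration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortedDemo {

    // No argument: natural order of the elements (here, alphabetical).
    static List<String> naturalOrder(List<String> codes) {
        return codes.stream().sorted().collect(Collectors.toList());
    }

    // Comparator: by length first, then alphabetically for equal lengths.
    static List<String> byLengthThenAlphabetical(List<String> codes) {
        return codes.stream()
                .sorted(Comparator.comparing(String::length)
                        .thenComparing(Comparator.naturalOrder()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> codes = Arrays.asList("IB320", "VY12", "AF7", "IB10");
        System.out.println(naturalOrder(codes));              // [AF7, IB10, IB320, VY12]
        System.out.println(byLengthThenAlphabetical(codes));  // [AF7, IB10, VY12, IB320]
    }
}
```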

 

10. collect(Collector)

The collect method is a very powerful tool to reduce a stream into another collection, a Map or a single value. Some of those operations can also be carried out by previously seen methods like reduce. Since there is a large number of possibilities, we will divide them according to their objective.


10.1 Reduction to List or Set. The collect method can transform a stream into another data collection like a List or a Set. For instance, if we wanted to return a list with the durations of the flights to a certain destination, we would write:

public List<Duration> getDurationsDestination(String d){
    return flights.stream().
       filter(x->x.getDestination().equals(d)).
       map(x->x.getDuration()).
       collect(Collectors.toList());
}

The Collectors class provides the methods toList and toSet, which means that the stream of durations obtained by the map operation can be converted into a list or a set of objects respectively. For instance, the set of the possible destinations from an airport is given by the method:

public Set<String> getDestinations(){
    return flights.stream().
       map(x->x.getDestination()).
       collect(Collectors.toSet());
}

The toList and toSet methods are particular cases of the toCollection method that receives a Supplier with, for instance, an invocation to the constructor of the type. This way we can return a SortedSet with the destinations of a certain day:

public SortedSet<String> getDestinationsInDate(LocalDate f){
   return flights.stream().
       filter(x->x.getDate().equals(f)).
       map(x->x.getDestination()).
       collect(Collectors.toCollection(TreeSet::new));
}

We can also transform the objects before collecting them. For example, if we wanted to obtain a list with the durations of the flights to a certain destination incremented by m minutes, the following method would be implemented:


public List<Duration> getIncrementedDurationsDestination(String d, Integer m){
   return flights.stream().
       filter(x->x.getDestination().equals(d)).
       map(x->x.getDuration().sum(new DurationImpl(0,m))).
       collect(Collectors.toList());
}

 
10.2 Statistics. If we only want to know the number of elements of the stream we can also use Collectors.counting(). Also, if the stream contains numerical data, the collectors Collectors.summingDouble (summingInt or summingLong) return the sum of the stream with the same type, and Collectors.averagingDouble (averagingInt or averagingLong) return the arithmetic average, always as a Double.
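
A minimal sketch of these three collectors on a list of prices. The class name, the method names and the sample values are assumptions for the demonstration; with Flight objects the argument would be a function such as Flight::getPrice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class BasicStatsDemo {

    // Number of elements, as a Long produced by Collectors.counting().
    static long count(List<Double> prices) {
        return prices.stream().collect(Collectors.counting());
    }

    // Sum of the values extracted by the given ToDoubleFunction.
    static double total(List<Double> prices) {
        return prices.stream().collect(Collectors.summingDouble(Double::doubleValue));
    }

    // Arithmetic average of the extracted values.
    static double mean(List<Double> prices) {
        return prices.stream().collect(Collectors.averagingDouble(Double::doubleValue));
    }

    public static void main(String[] args) {
        List<Double> prices = Arrays.asList(100.0, 150.0, 200.0);
        System.out.println(count(prices));  // 3
        System.out.println(total(prices));  // 450.0
        System.out.println(mean(prices));   // 150.0
    }
}
```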

The type DoubleSummaryStatistics is also available, and it can be used to obtain the count, sum, maximum, minimum and average of a series of numerical values. An object of this type is returned by Collectors.summarizingDouble, which receives a ToDoubleFunction (summarizingInt and summarizingLong are also available). For instance, to obtain the average of the prices of the flights to a certain destination, we would write:

public Double getAveragePricesDestination(String d){
   return flights.stream().
       filter(x->x.getDestination().equals(d)).
       collect(Collectors.summarizingDouble(Flight::getPrice)).
       getAverage();
}

Knowing that the variance of a sample is the average of the squares of the values minus the square of the average, we could obtain the variance of all the prices of the airport with the following method:

public Double getPriceVariance(){
    Double sqavg = flights.stream().
             collect(Collectors.summarizingDouble(
                    x->x.getPrice()*x.getPrice())).
             getAverage();
   Double average = flights.stream().
             collect(Collectors.summarizingDouble(Flight::getPrice)).
             getAverage();
   return sqavg-average*average;
}

When the usual arithmetical operations of average and sum are not available because the base type is not a numerical data type (Double, Integer, etc.), the reducing method of Collectors can be used to implement new operations over this base type. For instance, if we wanted to get the sum of the durations of all flights to a certain destination, each incremented by m minutes:
 

public Duration incrementAndAddDestinationDuration(String d, Integer m){
   return flights.stream().
       filter(x->x.getDestination().equals(d)).
       map(x->x.getDuration().sum(new DurationImpl(0,m))).
       collect(Collectors.reducing(
       new DurationImpl(0,0), Duration::sum));
}

 
10.3 Reduction to Map. The Collectors class provides the methods partitioningBy and groupingBy to organize the information of a stream in a Map. There is a large number of possibilities, since groupingBy can receive different parameters.

 

10.3.1 groupingBy(Function) and partitioningBy(Predicate). In this first case, groupingBy receives a Function, whereas partitioningBy is a particular case in which the Function is replaced by a Predicate. In both cases, the values of the resulting Map are Lists with the objects of the stream. Let us see different uses of this: if we wanted to build a Map<Boolean, List<Flight>> to separate the flights of the airport depending on whether they are full or not, we would write (using the Predicate from section 2):

public Map<Boolean, List<Flight>> getMapFullFlight(){
   return flights.stream().
       collect(Collectors.partitioningBy(flightFull()));    
}

 
A more complex case is to separate the flights into a Map<Boolean,List<Flight>> according to whether or not they have reached a given occupation threshold, expressed as a percentage:

 
public Map<Boolean, List<Flight>> getMapFlightsPercentage(Double p){
    return flights.stream().
       collect(Collectors.partitioningBy(
       x->x.getOccupation().compareTo(p/100)>=0));    
}


To obtain a Map with more than two keys we must use groupingBy. This way, to obtain a map that organizes flights by date, we would write:
 

public Map<LocalDate,List<Flight>> getMapFlightsByDate(){
    return flights.stream().
       collect(Collectors.groupingBy(Flight::getDate));
}

 

10.3.2 groupingBy(Function, Collector). Of course, the values of the Map do not necessarily have to be Lists of objects of the stream; they can also be Sets if, besides the Function, we give the groupingBy method a Collector for the values:
 

public Map<String, Set<Flight>> getMapFlightSetByDestination(){
   return flights.stream().
       collect(Collectors.groupingBy(
               Flight::getDestination,
              Collectors.toSet()));
}

If the values of the Map are a reduction or an operation over the elements of the stream, other objects of type Collector must be passed to groupingBy. Using the Collectors methods for counting, averaging or summing we obtain a reduced value for each key. For example, to obtain the number of flights per destination, the required method would be:

 

public Map<String, Long> getMapNumFlightsPerDestination(){
   return flights.stream().
       collect(Collectors.groupingBy(
              Flight::getDestination,                      
              Collectors.counting()));
}

 Also, if we want a Map with the average price for each destination:


public Map<String, Double> getMapAveragePricePerDestination(){
    return flights.stream().
       collect(Collectors.groupingBy(
             Flight::getDestination,
             Collectors.averagingDouble(
                           Flight::getPrice)));
}

Or a Map with the billing for each Date:

public Map<LocalDate, Double> getMapBillingPerDate(){
   return flights.stream().
       collect(Collectors.groupingBy(
             Flight::getDate,
             Collectors.summingDouble(
             (Flight x)->x.getNumPassengers()*x.getPrice())));
}

This is simpler using the ToDoubleFunction getBilling from section 6:

public Map<LocalDate, Double> getMapBillingPerDate (){
   return flights.stream().
       collect(Collectors.groupingBy(
             Flight::getDate,
             Collectors.summingDouble(getBilling)));
}

Sometimes the values of the Map are required to be a transformation of the elements of the stream. In this case, the Collector given as the second parameter of groupingBy is a Collectors.mapping, which receives a Function and another Collector. For instance, to build a Map that associates each destination with the list of its prices we would write:

public Map<String, List<Double>> getMapPricesPerDestination(){
    return flights.stream().
       collect(Collectors.groupingBy(
             Flight::getDestination,
             Collectors.mapping(Flight::getPrice,
             Collectors.toList())));
}

 
 
10.3.3 groupingBy(Function, Supplier, Collector). Providing a Supplier allows us to return a SortedMap as the result of groupingBy. This way, to obtain a SortedMap with a list of flights per date, we would write:
 

public SortedMap<LocalDate,List<Flight>> getSortedMapFlightsByDate(){
   return flights.stream().
       collect(Collectors.groupingBy(
             Flight::getDate,
             TreeMap::new,
             Collectors.toList()));
}

 
A more complex exercise is to invert a Map. That is, given a Map<K,V>, we want to obtain a Map<V,List<K>> or a Map<V,Set<K>> in which the keys and the values of the original Map have exchanged their roles. Since several keys can share the same value, each key of the inverted Map must be associated with a collection of keys of the original one. For instance, we previously saw a Map<String, Long> with the number of flights per destination. If we wanted to know the destinations with the most flights, a solution would be to invert that map into a SortedMap<Long,List<String>>, so that the largest key gives us the list of destinations with the greatest number of flights. In Java 7 this method invertMap could be written as a static method in this way:


public static <T,K> SortedMap<T, List<K>> invertMap(Map<K, T> m) {
    SortedMap<T, List<K>> res = new TreeMap<T,List<K>>();
    Set<K> sp = m.keySet();
    for(K elem: sp){
             T val =m.get(elem);
             if (res.containsKey(val)) {
                 res.get(val).add(elem);
             }
             else {
                 List<K> list = new LinkedList<K>();
                 list.add(elem);
                 res.put(val, list);
             }
    }
    return res;
}     

In Java 8 the code is more compact and, depending on the reader, easier or harder to understand. The idea is to turn the Map into a stream of key-value pairs by calling stream() on the result of entrySet. Once we have that stream of pairs, we invoke collect, grouping the pairs by their values (so that the values become the keys of the new map) and using the necessary mapping so that each group holds a List of the original keys. The TreeMap constructor allows us to obtain a SortedMap as the output:

public static <K,T> SortedMap<K, List<T>> invertMapToList(Map<T, K> m){
   return m.entrySet().stream().
       collect(Collectors.groupingBy(
             Map.Entry<T,K>::getValue,
             TreeMap::new,
             Collectors.mapping(
                    Map.Entry<T,K>::getKey,
                    Collectors.toList())));
}

Replacing the final toList() with toSet() we would obtain the method invertMapToSet. An immediate application of this code structure, for the particular case of obtaining the set of destinations associated with each number of flights, is the following method:

public SortedMap<Long, Set<String>> getMapDestinationsPerNumFlights(){
    return getMapNumFlightsPerDestination().
       entrySet().stream().
       collect(Collectors.groupingBy(
             Map.Entry<String,Long>::getValue,
             TreeMap::new,                                                            
             Collectors.mapping(
                    Map.Entry<String,Long>::getKey,
                    Collectors.toSet())));
}

So, the question of which are the destinations with a greater number of flights is answered this way:

public Set<String> getDestinationsMostFlights(){
       SortedMap<Long,Set<String>> m = getMapDestinationsPerNumFlights();
       return m.get(m.lastKey());
}

Or using the static method invertMapToSet:

public Set<String> getDestinationsMostFlights(){
       SortedMap<Long,Set<String>> m =
             Util.invertMapToSet(getMapNumFlightsPerDestination());
       return m.get(m.lastKey());
}
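
The helper Util.invertMapToSet used above is not listed in this text. The following is a minimal sketch of how it might look, assuming it mirrors invertMapToList with toSet() as the final collector. The class name InvertMapDemo and the sample counts are hypothetical and serve only to make the example runnable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

class InvertMapDemo {

    // Sketch of invertMapToSet: values become keys, original keys are gathered in Sets.
    // As with any TreeMap, the new keys must be mutually comparable at run time.
    static <K, T> SortedMap<K, Set<T>> invertMapToSet(Map<T, K> m) {
        return m.entrySet().stream()
                .collect(Collectors.groupingBy(
                        Map.Entry<T, K>::getValue,
                        TreeMap::new,
                        Collectors.mapping(Map.Entry<T, K>::getKey, Collectors.toSet())));
    }

    public static void main(String[] args) {
        // Hypothetical number of flights per destination.
        Map<String, Long> flightsPerDestination = new HashMap<>();
        flightsPerDestination.put("Madrid", 3L);
        flightsPerDestination.put("Paris", 5L);
        flightsPerDestination.put("Rome", 5L);

        SortedMap<Long, Set<String>> inverted = invertMapToSet(flightsPerDestination);
        // The last key is the largest count; its value holds the busiest destinations.
        System.out.println(inverted.lastKey());                // 5
        System.out.println(inverted.get(inverted.lastKey()));  // a set with Paris and Rome
    }
}
```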

A simpler version of this problem arises when we know that there is a single maximal element, which makes it unnecessary to return a Set with the possible values. For instance, if there were only one destination with the maximum number of flights, a solution would be to take the Map that assigns each destination its number of flights, turn its entry set into a stream, use a comparator over the values to obtain the entry with the largest one, retrieve that entry with get, and finally obtain its key with getKey:

public String getDestinationMostFlights(){
   return getMapNumFlightsPerDestination().
       entrySet().stream(). 
       max(Comparator.comparing(x->x.getValue())).
       get().
       getKey();
}

 
