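Determining the types#
To write a custom collector you implement the Collector interface. The first step is to determine three types:
- the type of the elements being collected
- the type of the accumulator (the intermediate container)
- the type of the final result
Suppose we want to implement the following collector: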
public class GroupingBy<T,K> implements Collector<T,Map<K,List<T>>,Map<K,List<T>>>
The three types are then:
- T (the elements)
- Map<K,List<T>> (the accumulator)
- Map<K,List<T>> (the final result)
Implementing the collector's components#
A collector has four important components, all of which are functions:
- supplier
- accumulator
- combiner
- finisher
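To see how the four functions fit together before building the grouping collector, here is a minimal sketch that wires them up with the standard Collector.of factory. The class name FourComponentsDemo and the method name toListSketch are made up for this illustration; the collector simply gathers elements into a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class FourComponentsDemo {
    // Hypothetical helper: builds a list collector from the four functions.
    static <T> Collector<T, List<T>, List<T>> toListSketch() {
        return Collector.of(
                ArrayList::new,                        // supplier: create an empty container
                (list, item) -> list.add(item),        // accumulator: add one element to the container
                (left, right) -> {                     // combiner: merge two partial containers
                    left.addAll(right);
                    return left;
                },
                list -> list);                         // finisher: turn the container into the result
    }

    public static void main(String[] args) {
        List<String> result = Stream.of("a", "b", "c").collect(toListSketch());
        System.out.println(result); // [a, b, c]
    }
}
```

Implementing the Collector interface directly, as the rest of this post does, supplies the same four functions through overridden methods instead of factory arguments.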
supplier#
The supplier creates the result container.
@Override
public Supplier<Map<K, List<T>>> supplier() {
return ()-> new HashMap<>();
}
accumulator#
The accumulator plays the role of the second argument of the three-argument reduce: it folds the next element into the result built so far.
@Override
public BiConsumer<Map<K, List<T>>, T> accumulator() {
return (accumulator,ele)->{
K key = this.classifier.apply(ele);
List<T> tList = accumulator.get(key);
if (tList == null){
tList = new ArrayList<>();
}
tList.add(ele);
accumulator.put(key,tList);
};
}
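Before adding the next element, the accumulator checks whether the map already holds a list for the key and creates one if it does not.
The key point is how the key is obtained: a classifier function passed into the collector maps each element to its key.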
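The source does not show how the classifier reaches the collector, so the sketch below assumes it is a Function stored in a field and supplied through the constructor. The class name AccumulatorSketch is hypothetical. It also uses Map.computeIfAbsent as a shorter alternative to the explicit null check; the behavior is intended to be the same.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

class AccumulatorSketch<T, K> {
    // Assumed field: the original code refers to this.classifier without showing its declaration.
    private final Function<? super T, ? extends K> classifier;

    AccumulatorSketch(Function<? super T, ? extends K> classifier) {
        this.classifier = classifier;
    }

    BiConsumer<Map<K, List<T>>, T> accumulator() {
        // computeIfAbsent creates the list the first time a key is seen,
        // then the element is appended to that key's list.
        return (map, element) ->
                map.computeIfAbsent(classifier.apply(element), key -> new ArrayList<>()).add(element);
    }

    public static void main(String[] args) {
        // Group words by their length.
        AccumulatorSketch<String, Integer> sketch = new AccumulatorSketch<>(String::length);
        Map<Integer, List<String>> map = new HashMap<>();
        for (String word : List.of("a", "bb", "cc")) {
            sketch.accumulator().accept(map, word);
        }
        System.out.println(map); // {1=[a], 2=[bb, cc]}
    }
}
```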
combiner#
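The combiner corresponds to the third argument of reduce: it merges the partial containers that were produced (for example by a parallel stream) into one.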
@Override
public BinaryOperator<Map<K, List<T>>> combiner() {
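return (l,r)->{
// putAll would replace l's list whenever both maps hold the same key, so merge the lists per key
r.forEach((key,list)->l.merge(key,list,(a,b)->{a.addAll(b);return a;}));
return l;
};
}
Merge every entry of the second map into the first and return the first. When both maps contain the same key their lists are concatenated; a plain putAll would overwrite the first map's list and drop its elements.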
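Two partial maps can hold the same key when a stream runs in parallel, and in that case the lists for that key need to be concatenated, because Map.putAll would keep only one of them. The following is a small sketch of that merge step on its own; the class name CombinerSketch and the method name mergeLists are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CombinerSketch {
    // Merges right into left, concatenating the lists of keys present in both maps.
    static <K, T> Map<K, List<T>> mergeLists(Map<K, List<T>> left, Map<K, List<T>> right) {
        right.forEach((key, list) -> left.merge(key, list, (a, b) -> {
            a.addAll(b); // key exists on both sides: append right's elements to left's list
            return a;
        }));
        return left;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> left = new HashMap<>();
        left.put(1, new ArrayList<>(List.of("a")));

        Map<Integer, List<String>> right = new HashMap<>();
        right.put(1, new ArrayList<>(List.of("b")));
        right.put(2, new ArrayList<>(List.of("cc")));

        System.out.println(mergeLists(left, right)); // {1=[a, b], 2=[cc]}
    }
}
```

With left.putAll(right) the entry for key 1 would end up as [b] only, so the element "a" would be lost.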
finisher#
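The finisher converts the accumulated container into the final result. Here the accumulator type and the result type are the same, so it simply returns the container.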
@Override
public Function<Map<K, List<T>>, Map<K, List<T>>> finisher() {
return accumulator->accumulator;
}
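Extra: characteristics#
characteristics describes properties of the collector that the stream implementation can use to optimize the reduction.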
@Override
public Set<Characteristics> characteristics() {
return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH));
}
The relevant explanation, from the JDK source of Collector.Characteristics:
/**
* Characteristics indicating properties of a {@code Collector}, which can
* be used to optimize reduction implementations.
*/
enum Characteristics {
/**
* Indicates that this collector is <em>concurrent</em>, meaning that
* the result container can support the accumulator function being
* called concurrently with the same result container from multiple
* threads.
*
* <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
* then it should only be evaluated concurrently if applied to an
* unordered data source.
*/
CONCURRENT,
/**
* Indicates that the collection operation does not commit to preserving
* the encounter order of input elements. (This might be true if the
* result container has no intrinsic order, such as a {@link Set}.)
*/
UNORDERED,
/**
* Indicates that the finisher function is the identity function and
* can be elided. If set, it must be the case that an unchecked cast
* from A to R will succeed.
*/
IDENTITY_FINISH
}
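Putting the pieces together, a complete version of the collector might look like the sketch below. The classifier field and the constructor are assumptions, since the post only refers to this.classifier; the class is declared without public so the example fits in a single file, and the combiner merges lists per key so that results from a parallel stream are not lost.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

class GroupingBy<T, K> implements Collector<T, Map<K, List<T>>, Map<K, List<T>>> {
    // Assumed: the classifier is handed in through the constructor.
    private final Function<? super T, ? extends K> classifier;

    GroupingBy(Function<? super T, ? extends K> classifier) {
        this.classifier = classifier;
    }

    @Override
    public Supplier<Map<K, List<T>>> supplier() {
        return HashMap::new; // empty container
    }

    @Override
    public BiConsumer<Map<K, List<T>>, T> accumulator() {
        // Look up (or create) the list for the element's key, then append the element.
        return (map, element) ->
                map.computeIfAbsent(classifier.apply(element), key -> new ArrayList<>()).add(element);
    }

    @Override
    public BinaryOperator<Map<K, List<T>>> combiner() {
        // Concatenate lists for keys that appear in both partial maps.
        return (left, right) -> {
            right.forEach((key, list) -> left.merge(key, list, (a, b) -> {
                a.addAll(b);
                return a;
            }));
            return left;
        };
    }

    @Override
    public Function<Map<K, List<T>>, Map<K, List<T>>> finisher() {
        return Function.identity(); // accumulator type equals result type
    }

    @Override
    public Set<Characteristics> characteristics() {
        return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> byLength = List.of("a", "bb", "cc", "ddd", "e").stream()
                .collect(new GroupingBy<String, Integer>(String::length));
        System.out.println(byLength); // {1=[a, e], 2=[bb, cc], 3=[ddd]}
    }
}
```

Because the collector declares only IDENTITY_FINISH (neither CONCURRENT nor UNORDERED), a parallel stream gives each thread its own map from the supplier and relies on the combiner to merge them in encounter order.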