[JAVA] Creating Streams / Intermediate Operations map, flatMap / Terminal Operations reduce, collect / Optional (2)

Imvory 2024. 3. 26. 16:16

1. Creating streams

Collections : Stream<T> Collection.stream()
- ex : Stream<Integer> intStream = list.stream();
Arrays
- Stream<T> Stream.of(T... values)
- Stream<T> Arrays.stream(T[] array)
  - ex 1 : Stream<String> strStream = Stream.of( "a","b","c" );
  - ex 2 : Stream<String> strStream = Arrays.stream(new String[]{ "a","b","c" });
Integers in a range : creates and returns a stream of consecutive integers in the given range
- IntStream.range(int begin, int end) : end is not included in the range
- IntStream.rangeClosed(int begin, int end): end is included in the range
  - ex 1 : IntStream intstr = IntStream.range(1,5); //1,2,3,4
  - ex 2 : IntStream intstr = IntStream.rangeClosed(1,5); //1,2,3,4,5
Random numbers : the Random class has methods that return a stream of random numbers
- Called without arguments, the methods below return an infinite stream with no fixed size, so limit() has to be used with them to bound the stream.
- IntStream ints()
- LongStream longs()
- DoubleStream doubles()

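IntStream intStr = new Random().ints(); //infinite stream
intStr.limit(5).forEach(System.out::println); //prints only 5 elements

IntStream intStr2 = new Random().ints(5); //finite stream : returns a stream of 5 random numbers

//A stream of random numbers within a given range is also possible
//Note that end is not included in the range
IntStream intStr3 = new Random().ints(1, 10); //infinite stream of values from 1 to 9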
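
The sources listed above can be tried side by side. The following is a minimal sketch; the class name `StreamSourcesDemo` and its helper methods are made up for illustration and are not part of the original notes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: one helper per stream source described above
class StreamSourcesDemo {
    // Collection source: Collection.stream()
    static List<Integer> fromList(List<Integer> list) {
        return list.stream().collect(Collectors.toList());
    }

    // Array source: Arrays.stream(T[])
    static List<String> fromArray(String[] arr) {
        return Arrays.stream(arr).collect(Collectors.toList());
    }

    // range() excludes end, rangeClosed() includes it
    static int[] range(int begin, int end) {
        return IntStream.range(begin, end).toArray();
    }

    static int[] rangeClosed(int begin, int end) {
        return IntStream.rangeClosed(begin, end).toArray();
    }

    // Infinite random stream bounded with limit(); values are in [1, 10)
    static int[] randoms(int n) {
        return new Random().ints(1, 10).limit(n).toArray();
    }

    public static void main(String[] args) {
        System.out.println(fromList(Arrays.asList(3, 1, 2)));            // [3, 1, 2]
        System.out.println(fromArray(new String[]{"a", "b", "c"}));      // [a, b, c]
        System.out.println(Arrays.toString(range(1, 5)));                // [1, 2, 3, 4]
        System.out.println(Arrays.toString(rangeClosed(1, 5)));          // [1, 2, 3, 4, 5]
        System.out.println(Arrays.toString(randoms(5)));                 // five random values
    }
}
```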

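Lambda expressions - iterate(), generate() : take a lambda expression as an argument and create an infinite stream whose elements are the values computed by that lambda
- static <T> Stream<T> iterate(T seed, UnaryOperator<T> f)
  - starts from the seed value, then repeatedly applies the lambda f, feeding each result back in as the next seed
  - ex ) Stream<Integer> evenStream = Stream.iterate(0, n -> n + 2); //0,2,4,6, …
- static <T> Stream<T> generate(Supplier<T> s)
  - like iterate(), creates and returns an infinite stream whose elements are computed by a lambda, but it does not use the previous result to compute the next element
  - the parameter type is Supplier<T>, so only a lambda with no parameters is allowed
  - ex ) Stream<Double> randomStream = Stream.generate(Math::random);
- Both methods return a Stream<T>, so the result cannot be assigned to a primitive stream variable such as IntStream. Convert it with mapToInt() and the like when a primitive stream is needed.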
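
As a rough illustration of the points above, the sketch below bounds both kinds of infinite stream with limit() and shows the mapToInt() conversion. `InfiniteStreamDemo` and its method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class for iterate() and generate()
class InfiniteStreamDemo {
    // iterate(): each element is computed from the previous one
    static List<Integer> firstEvens(int n) {
        return Stream.iterate(0, x -> x + 2).limit(n).collect(Collectors.toList());
    }

    // generate(): the Supplier takes no argument and never sees earlier elements
    static List<String> repeat(String s, int n) {
        return Stream.generate(() -> s).limit(n).collect(Collectors.toList());
    }

    // A Stream<Integer> is not an IntStream; mapToInt() performs the conversion
    static int sumOfFirstEvens(int n) {
        IntStream evens = Stream.iterate(0, x -> x + 2).mapToInt(Integer::intValue);
        return evens.limit(n).sum();
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(4));       // [0, 2, 4, 6]
        System.out.println(repeat("a", 3));      // [a, a, a]
        System.out.println(sumOfFirstEvens(4));  // 12
    }
}
```

Without limit() (or another short-circuiting operation), a terminal operation on either stream would never finish.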
Files : list() creates and returns a stream whose source is the list of files in the given directory.
- Stream<Path> Files.list(Path dir)
Empty stream : a stream with no elements at all can also be created.
- Stream emptyStream = Stream.empty();
Concatenating two streams : the concat() method joins two streams of the same type into one.
- Stream<String> strs = Stream.concat(strs1, strs2);

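2. Intermediate operations

: Of the many operations, only map and flatMap are covered here, since they need a more detailed explanation

- map() : used to pull out only the field you want from the values stored in the stream's elements, or to convert them into a particular form

ex) Extracting only the file names from a stream of File objects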

Stream<File> fileStream = Stream.of(new File("ex1.java"),new File("ex2.java"),
				new File("ex.bak"),new File("ex1.txt"));
									
Stream<String> fileNameStream = fileStream.map(File::getName);

* map is an intermediate operation, so its result is a stream.

* It can be applied several times in a row on the same pipeline.
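
To make the chaining concrete, here is a small sketch that applies map() several times to the same kind of File stream as above. The class `MapDemo` and the method `extensions` are hypothetical names, and filter() and distinct() are used even though this post does not cover them.

```java
import java.io.File;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: chained map() calls
class MapDemo {
    // Returns the distinct, upper-cased file extensions in encounter order
    static List<String> extensions(Stream<File> files) {
        return files.map(File::getName)                              // Stream<File> -> Stream<String>
                .filter(name -> name.indexOf('.') != -1)             // skip names without an extension
                .map(name -> name.substring(name.indexOf('.') + 1))  // "ex1.java" -> "java"
                .map(String::toUpperCase)                            // "java" -> "JAVA"
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<File> fileStream = Stream.of(new File("ex1.java"), new File("ex2.java"),
                new File("ex.bak"), new File("ex1.txt"));
        System.out.println(extensions(fileStream)); // [JAVA, BAK, TXT]
    }
}
```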

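- mapToInt(), mapToLong(), mapToDouble() : when the elements of a stream are converted to numbers, converting to a primitive stream can be more useful.

* Primitive streams provide convenient methods for working with numbers

int sum() : total of all elements in the stream
OptionalDouble average() : sum() / (double) count()
OptionalInt max() : the largest element in the stream
OptionalInt min() : the smallest element in the stream
These methods are terminal operations, so the stream is closed once one of them has been called.

→ sum() and average() cannot be called one after the other on the same stream. The summaryStatistics() method exists for that case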

IntSummaryStatistics stat = scoreStream.summaryStatistics();
long totalCount = stat.getCount();
long totalScore = stat.getSum();
double avgScore = stat.getAverage();
int min = stat.getMin();
int max = stat.getMax();

* Use mapToObj() to convert an IntStream to a Stream<T>

* Use boxed() to convert an IntStream to a Stream<Integer>
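
The conversions in both directions can be sketched as follows. `PrimitiveStreamDemo` and its helpers are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: Stream<T> <-> IntStream conversions
class PrimitiveStreamDemo {
    // mapToInt(): Stream<String> -> IntStream, then all statistics in one pass
    static IntSummaryStatistics lengthStats(List<String> words) {
        return words.stream().mapToInt(String::length).summaryStatistics();
    }

    // mapToObj(): IntStream -> Stream<String>
    static List<String> labels(int n) {
        return IntStream.rangeClosed(1, n).mapToObj(i -> "item" + i).collect(Collectors.toList());
    }

    // boxed(): IntStream -> Stream<Integer>
    static List<Integer> boxedRange(int n) {
        return IntStream.rangeClosed(1, n).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        IntSummaryStatistics stat = lengthStats(Arrays.asList("a", "bbb", "cc"));
        System.out.println(stat.getSum());      // 6
        System.out.println(stat.getAverage());  // 2.0
        System.out.println(labels(3));          // [item1, item2, item3]
        System.out.println(boxedRange(3));      // [1, 2, 3]
    }
}
```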

- flatMap() : converts a Stream<T[]> into a Stream<T> by mapping each element to a stream and merging the resulting streams into one

Stream<String[]> strArrStream = Stream.of(
					new String[]{"abc","def","ghi"},
					new String[]{"ABC","GHI","JKLMN"}
);

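//flatMap
Stream<String> strStream = strArrStream.flatMap(Arrays::stream);

//With plain map(), the result is a stream of streams
//(a stream can be consumed only once, so this line needs a freshly created strArrStream)
Stream<Stream<String>> strStream2 = strArrStream.map(Arrays::stream);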
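
A runnable variant of the idea, as a sketch: one helper flattens an array of arrays, the other splits lines into words. `FlatMapDemo`, `flatten`, and `words` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: flatMap() merges the per-element streams into one
class FlatMapDemo {
    // Stream<String[]> -> Stream<String>
    static List<String> flatten(String[][] arrays) {
        return Arrays.stream(arrays)
                .flatMap(Arrays::stream)
                .collect(Collectors.toList());
    }

    // Each line becomes a stream of words; flatMap() concatenates those streams
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Stream.of(line.split(" +")))
                .map(String::toLowerCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String[][] arrays = {{"abc", "def", "ghi"}, {"ABC", "GHI", "JKLMN"}};
        System.out.println(flatten(arrays)); // [abc, def, ghi, ABC, GHI, JKLMN]
        System.out.println(words(Arrays.asList("Believe or not", "Do or do not")));
        // [believe, or, not, do, or, do, not]
    }
}
```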

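3. Optional<T> and OptionalInt

Optional<T> is a generic class, a wrapper class that wraps 'an object of type T'. An Optional object can therefore hold a reference of any type.
Some terminal operations return their result wrapped in an Optional object.

→ The methods defined on Optional make it possible to write concise, safe code that avoids NullPointerException without if statements for null checks

* Creating an Optional object

Use of() or ofNullable()
If the reference might be null, use ofNullable() (prevents NullPointerException)
Use empty() to initialize
Use get() to retrieve the value; orElse() specifies a substitute in case no value is present
As variants of orElse(), there are orElseGet(), which takes a lambda, and orElseThrow(), which throws an exception
isPresent() returns false if the Optional holds no value, true otherwise
ifPresent() runs the given lambda if a value is present and does nothing otherwise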

/* Creation */
String str = "abc";
Optional<String> optVal = Optional.of(str);

/* When the value is null */
Optional<String> optVal = Optional.of(null); // throws NullPointerException
Optional<String> optVal = Optional.ofNullable(null); // OK : empty Optional

/* Initialization */
Optional<String> optVal = null; //not recommended
Optional<String> optVal = Optional.empty();

/* Getting the value */
String str1 = optVal.get(); //returns the stored value; throws NoSuchElementException if empty
String str2 = optVal.orElse(""); //returns "" if empty

/* orElse() variants */
String str3 = optVal.orElseGet(String::new); //creates a new String if empty
String str4 = optVal.orElseThrow(NullPointerException::new); //throws NPE if empty

/* isPresent(), ifPresent() */
//traditional null check
if(str != null) {
	System.out.println(str);
}

//using isPresent()
if(Optional.ofNullable(str).isPresent()) {
	System.out.println(str);
}

//using ifPresent()
Optional.ofNullable(str).ifPresent(System.out::println);
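
The snippets above redeclare the same variable, so they do not compile as one unit. Below is a compilable sketch of the same calls; `OptionalDemo` and its methods are hypothetical, and Optional.map() is used although this post does not cover it.

```java
import java.util.Optional;

// Hypothetical demo class: null handling without explicit if (x != null) checks
class OptionalDemo {
    // Length of s, or 0 when s is null
    static int lengthOrZero(String s) {
        return Optional.ofNullable(s).map(String::length).orElse(0);
    }

    // orElseGet(): the default is computed only when the Optional is empty
    static String nameOrDefault(String s) {
        return Optional.ofNullable(s).orElseGet(() -> "unknown");
    }

    // orElseThrow(): fail with a chosen exception when no value is present
    static String require(String s) {
        return Optional.ofNullable(s)
                .orElseThrow(() -> new IllegalArgumentException("value is required"));
    }

    public static void main(String[] args) {
        System.out.println(lengthOrZero("abc"));   // 3
        System.out.println(lengthOrZero(null));    // 0
        System.out.println(nameOrDefault(null));   // unknown
        Optional.ofNullable("abc").ifPresent(System.out::println); // abc
    }
}
```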

* Methods defined on Stream that return Optional<T>

findAny()
findFirst()
max(Comparator)
min(Comparator)
reduce(BinaryOperator)

→ Unlike on primitive streams, max() and min() take a Comparator as an argument
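
A short sketch of two of these methods follows. The class `StreamOptionalDemo` and its helpers are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class: Stream methods that return Optional<T>
class StreamOptionalDemo {
    // findFirst(): empty Optional when nothing matches
    static Optional<String> firstStartingWith(List<String> words, String prefix) {
        return words.stream().filter(w -> w.startsWith(prefix)).findFirst();
    }

    // max(Comparator): Stream<T> needs a Comparator, unlike IntStream.max()
    static Optional<String> longest(List<String> words) {
        return words.stream().max(Comparator.comparingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("kiwi", "banana", "fig");
        System.out.println(firstStartingWith(words, "b"));            // Optional[banana]
        System.out.println(firstStartingWith(words, "x"));            // Optional.empty
        System.out.println(longest(words).orElse("none"));            // banana
        System.out.println(longest(Collections.<String>emptyList())); // Optional.empty
    }
}
```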

* On primitive streams, the corresponding methods return OptionalInt, OptionalLong, or OptionalDouble, Optional variants that hold a primitive value.

The following methods are defined on IntStream.

OptionalInt findAny()
OptionalInt findFirst()
OptionalInt reduce(IntBinaryOperator op)
OptionalInt max()
OptionalInt min()
OptionalDouble average()

* The default value of the primitive int is 0. An OptionalInt that holds no value also stores 0 internally.

Are the two objects below the same?

OptionalInt opt = OptionalInt.of(0);
OptionalInt opt2 = OptionalInt.empty();

→ An absent value and a stored 0 can be told apart with isPresent().

opt.isPresent(); → true
opt2.isPresent(); → false
opt.getAsInt(); → 0
opt2.getAsInt(); → throws NoSuchElementException
System.out.println(opt); → OptionalInt[0]
System.out.println(opt2); → OptionalInt.empty
opt.equals(opt2); → false

An Optional created from null is treated the same as an empty one.

Optional<String> opt = Optional.ofNullable(null);
Optional<String> opt2 = Optional.empty();

System.out.println(opt.equals(opt2)); //true
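
The difference between "0 stored" and "nothing stored" shows up naturally with IntStream.max(). The sketch below assumes a hypothetical class `OptionalIntDemo` with two small helpers.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

// Hypothetical demo class: an empty OptionalInt is not the same as OptionalInt.of(0)
class OptionalIntDemo {
    // max() of an empty IntStream is OptionalInt.empty(), not 0
    static OptionalInt maxOf(int... values) {
        return IntStream.of(values).max();
    }

    // isPresent() distinguishes a stored 0 from a missing value
    static String describe(OptionalInt opt) {
        return opt.isPresent() ? "max=" + opt.getAsInt() : "no elements";
    }

    public static void main(String[] args) {
        System.out.println(describe(maxOf(0)));       // max=0
        System.out.println(describe(maxOf()));        // no elements
        System.out.println(maxOf(0).equals(maxOf())); // false
    }
}
```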

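4. Terminal operations

: Of the many operations, only reduce and collect are covered here, since they need a more detailed explanation

- reduce() : performs an operation while reducing the elements of the stream and returns the final result. It combines the first two elements, then combines that result with the next element. Each step consumes one element, and once every element has been consumed the result is returned.

Optional<T> reduce(BinaryOperator<T> accumulator)
There is also a reduce() that takes an initial value : the operation starts with the initial value and the first element of the stream
- T reduce(T identity, BinaryOperator<T> accumulator)
- U reduce(U identity, BiFunction<U,T,U> accumulator, BinaryOperator<U> combiner)
  → the combiner is used to merge the partial results produced by a parallel stream

ex) Terminal operations such as count() and sum() can be expressed with reduce() as follows.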

//Stream<Integer>
int count = intStream.reduce(0, (a,b) -> a + 1);
int sum = intStream.reduce(0, (a,b) -> a + b);
int max = intStream.reduce(Integer.MIN_VALUE, (a,b) -> a>b ? a:b);
int min = intStream.reduce(Integer.MAX_VALUE, (a,b) -> a<b ? a:b);

/* max and min do not need an initial value,
so the single-argument reduce is the better choice.
The calls below assume intStream is a primitive IntStream */
OptionalInt max = intStream.reduce( (a,b) -> a > b ? a : b);
OptionalInt min = intStream.reduce( (a,b) -> a < b ? a : b);

//the lambdas above rewritten as method references
OptionalInt max = intStream.reduce(Integer::max);
OptionalInt min = intStream.reduce(Integer::min);
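
The three overloads of reduce() can be compared in one place. This is a sketch with a hypothetical class `ReduceDemo`; each helper builds a fresh stream, because a stream cannot be reused after a terminal operation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class: the three reduce() overloads
class ReduceDemo {
    // reduce(identity, accumulator): returns T, so an empty stream yields the identity
    static int sum(List<Integer> nums) {
        return nums.stream().reduce(0, Integer::sum);
    }

    // reduce(accumulator): no identity, so the result is an Optional
    static Optional<Integer> max(List<Integer> nums) {
        return nums.stream().reduce(Integer::max);
    }

    // reduce(identity, accumulator, combiner): result type (Integer) differs from
    // the element type (String); the combiner merges partial sums of a parallel stream
    static int totalLength(List<String> words) {
        return words.parallelStream().reduce(0, (len, w) -> len + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(3, 7, 5);
        System.out.println(sum(nums));                                 // 15
        System.out.println(max(nums));                                 // Optional[7]
        System.out.println(totalLength(Arrays.asList("a", "bbb", "cc"))); // 6
    }
}
```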

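- collect() : an operation that gathers the elements of a stream. Similar to reducing (reduce()).

For collect() to gather the elements, how they should be gathered has to be defined, and that definition is the collector.

collect() : a terminal operation of the stream. Takes a collector as its argument.
Collector : an interface. A collector must implement this interface.
Collectors : a class. Provides ready-made collectors through static methods.

<R, A> R collect(Collector<? super T, A, R> collector)
<R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner)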
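
The three-argument form of collect() is rarely written by hand, so a small sketch may help: the supplier creates the container, the accumulator adds one element, and the combiner merges two containers. `CollectDemo` and its helpers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class: collect(supplier, accumulator, combiner) without a Collector
class CollectDemo {
    // Roughly what Collectors.toList() does, spelled out by hand
    static List<String> toListManually(Stream<String> s) {
        ArrayList<String> result = s.collect(
                ArrayList::new,      // supplier: new empty container
                ArrayList::add,      // accumulator: add one element
                ArrayList::addAll);  // combiner: merge two containers (parallel streams)
        return result;
    }

    // The same pattern with a StringBuilder as the mutable container
    static String concat(Stream<String> s) {
        StringBuilder sb = s.collect(
                StringBuilder::new, StringBuilder::append, StringBuilder::append);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toListManually(Stream.of("a", "b", "c"))); // [a, b, c]
        System.out.println(concat(Stream.of("a", "b", "c")));         // abc
    }
}
```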

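* Converting a stream to a collection or an array

: To gather all elements of a stream into a collection, use a method such as toList() of the Collectors class.

To use a specific collection other than List or Set, pass a constructor reference of that collection to toCollection().

- Collectors : toList(), toSet(), toMap(), toCollection() / Stream : toArray()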

//collect with toList()
List<String> names = studentStream.map(Student::getName).collect(Collectors.toList());

//collect into an ArrayList with toCollection
ArrayList<String> list = names.stream().collect(Collectors.toCollection(ArrayList::new));

//toMap() : key - student number, value - the Student object
Map<String,Student> map = studentStream.collect(Collectors.toMap(s->s.getHakbun(), s->s));

/* toArray() : returns the elements of the stream as a T[] array.
A constructor reference for the array type must be passed; without it the result is Object[] */
Student[] stuNames = studentStream.toArray(Student[]::new); //OK
Student[] stuNames = studentStream.toArray(); //Error : Object[] cannot be assigned to Student[]
Object[] stuNames = studentStream.toArray(); //OK
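
The snippets above depend on a studentStream that is not shown, so here is a self-contained sketch that uses plain strings instead. `ToCollectionDemo` and its helpers are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical demo class: toCollection(), toMap() and toArray() on a list of names
class ToCollectionDemo {
    // toCollection(): choose the concrete collection through a constructor reference
    static TreeSet<String> sortedUnique(List<String> names) {
        return names.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    // toMap(): key mapper and value mapper; duplicate keys throw IllegalStateException,
    // so duplicates are removed with distinct() first
    static Map<String, Integer> lengthByName(List<String> names) {
        return names.stream().distinct().collect(Collectors.toMap(n -> n, String::length));
    }

    // toArray(): pass an array constructor reference to get String[] instead of Object[]
    static String[] toArray(List<String> names) {
        return names.stream().toArray(String[]::new);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("lee", "kim", "lee", "park");
        System.out.println(sortedUnique(names));              // [kim, lee, park]
        System.out.println(lengthByName(names).get("park"));  // 4
        System.out.println(Arrays.toString(toArray(names)));  // [lee, kim, lee, park]
    }
}
```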

* Statistics

: The statistics offered by the terminal operations can also be obtained through collect(). This is mainly needed in combination with groupingBy()

- counting(), summingInt(), averagingInt(), maxBy(), minBy()

// terminal operation count()
long count = studentStream.count();
// counting() provided by Collectors
long count = studentStream.collect(Collectors.counting());
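
The remaining statistics collectors follow the same pattern. The sketch below runs them over a list of scores; `StatsCollectorsDemo` and its helpers are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical demo class: statistics through collect() instead of count()/sum()/max()
class StatsCollectorsDemo {
    static long count(List<Integer> scores) {
        return scores.stream().collect(Collectors.counting());
    }

    static int total(List<Integer> scores) {
        return scores.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    static double average(List<Integer> scores) {
        return scores.stream().collect(Collectors.averagingInt(Integer::intValue));
    }

    // maxBy() takes a Comparator and returns an Optional, like Stream.max()
    static Optional<Integer> highest(List<Integer> scores) {
        return scores.stream().collect(Collectors.maxBy(Comparator.naturalOrder()));
    }

    public static void main(String[] args) {
        List<Integer> scores = Arrays.asList(200, 100, 300, 200);
        System.out.println(count(scores));    // 4
        System.out.println(total(scores));    // 800
        System.out.println(average(scores));  // 200.0
        System.out.println(highest(scores));  // Optional[300]
    }
}
```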

* Reducing

: Reducing can likewise be done through collect()

- reducing()

IntStream intStream = new Random().ints(1,46).distinct().limit(6);

// terminal operation reduce()
OptionalInt max = intStream.reduce(Integer::max);

// reducing() from Collectors (an IntStream has to be boxed first)
Optional<Integer> max = intStream.boxed().collect(Collectors.reducing(Integer::max));

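* Joining strings

: Concatenates all elements of a string stream into a single string. A delimiter, a prefix, and a suffix can be specified

Joining is only possible when the elements are a subtype of CharSequence, such as String or StringBuffer

- joining()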

String studentNames = studentStream.map(Student::getName).collect(Collectors.joining());
String studentNames = studentStream.map(Student::getName).collect(Collectors.joining(","));
String studentNames = studentStream.map(Student::getName).collect(Collectors.joining(",","[","]"));
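
For reference, this is what the three joining() variants produce on a small list. The sketch uses a hypothetical class `JoiningDemo` and plain strings rather than Student objects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: the three overloads of Collectors.joining()
class JoiningDemo {
    static String plain(List<String> names) {
        return names.stream().collect(Collectors.joining());
    }

    static String csv(List<String> names) {
        return names.stream().collect(Collectors.joining(","));
    }

    // delimiter, prefix, suffix
    static String bracketed(List<String> names) {
        return names.stream().collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("kim", "lee", "park");
        System.out.println(plain(names));      // kimleepark
        System.out.println(csv(names));        // kim,lee,park
        System.out.println(bracketed(names));  // [kim,lee,park]
    }
}
```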

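* Grouping and partitioning

: Grouping means grouping the elements of a stream by some criterion. Partitioning means splitting the elements into two groups: those that match a given condition and those that do not

- groupingBy(), partitioningBy()

- groupingBy() classifies the elements with a Function, partitioningBy() with a Predicate

- Use partitioningBy() when the stream has to be split into two groups, and groupingBy() otherwise

- The result of grouping or partitioning is returned in a Map

ex ) Student class : definition of the student class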

class Student {
    String name;
    boolean isMale;
    int hak;
    int ban;
    int score;

    Student(String name, boolean isMale, int hak, int ban, int score) {
        this.name = name;
        this.isMale = isMale;
        this.hak = hak;
        this.ban = ban;
        this.score = score;
    }

    public String getName() {
        return name;
    }

    public boolean isMale() {
        return isMale;
    }

    public int getHak() {
        return hak;
    }

    public int getBan() {
        return ban;
    }

    public int getScore() {
        return score;
    }

    @Override
    public String toString() {
        return "Student{" +
                "name='" + name + '\'' +
                ", isMale=" + isMale +
                ", hak=" + hak +
                ", ban=" + ban +
                ", score=" + score +
                '}';
    }
}

Classification with partitioningBy()

Student[] students = {
        new Student("김자바", false, 1,2,200),
        new Student("이자바", true, 2,2,100),
        new Student("박자바", false, 3,3,300),
        new Student("최자바", true, 1,4,200)
};



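//The code below assumes: import static java.util.stream.Collectors.*; import static java.util.Comparator.comparingInt;
//1. Partition : split by gender, then collect each group into a list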
Map<Boolean, List<Student>> stuBySex = Stream.of(students).collect(partitioningBy(Student::isMale));

List<Student> maleStudent = stuBySex.get(true);
List<Student> femaleStudent = stuBySex.get(false);


//2. Partition + statistics : count the male and female students with counting()
Map<Boolean, Long> stuNumBySex = Stream.of(students).collect(partitioningBy(Student::isMale, counting()));
System.out.println("Male students : " + stuNumBySex.get(true));
System.out.println("Female students : " + stuNumBySex.get(false));

//3. Partition + statistics : find the top-scoring male and female student with maxBy()
Map<Boolean, Optional<Student>> topScoreBySex = Stream.of(students).collect(partitioningBy(Student::isMale, maxBy(comparingInt(Student::getScore))));
System.out.println("Top male student : " + topScoreBySex.get(true));
System.out.println("Top female student : " + topScoreBySex.get(false));

//3-1. Same result as a Student instead of an Optional : use collectingAndThen() with Optional::get
//(Optional::get throws NoSuchElementException if one of the partitions is empty)
Map<Boolean, Student> topScoreBySex2 = Stream.of(students).collect(partitioningBy(Student::isMale, collectingAndThen(maxBy(comparingInt(Student::getScore)), Optional::get)));
System.out.println("Top male student : " + topScoreBySex2.get(true));
System.out.println("Top female student : " + topScoreBySex2.get(false));


/*4. Multi-level partitioning
1) partition by gender (male/female)
2) partition by score (fail/pass)
*/
Map<Boolean, Map<Boolean, List<Student>>> failedStuBySex = Stream.of(students).collect(partitioningBy(Student::isMale, // 1. partition by gender
        partitioningBy(s -> s.getScore() < 150))); //2. partition by score
List<Student> failedMaleStudent = failedStuBySex.get(true).get(true); //male students who failed
List<Student> failedFemaleStudent = failedStuBySex.get(false).get(true); //female students who failed

Result

Male students : 2
Female students : 2
Top male student : Optional[Student{name='최자바', isMale=true, hak=1, ban=4, score=200}]
Top female student : Optional[Student{name='박자바', isMale=false, hak=3, ban=3, score=300}]
Top male student : Student{name='최자바', isMale=true, hak=1, ban=4, score=200}
Top female student : Student{name='박자바', isMale=false, hak=3, ban=3, score=300}

Classification with groupingBy()

//1. Group the students by class (ban)
Map<Integer, List<Student>> stuByBan = Stream.of(students).collect(groupingBy(Student::getBan, toList())); //toList() can be omitted

System.out.println("===Grouped by class===");
for(List<Student> ban : stuByBan.values()) {
    for(Student s : ban) {
        System.out.println(s);
    }
}

/*2. Multi-level grouping
    1) group by grade (hak)
    2) group by class (ban)
*/
Map<Integer, Map<Integer, List<Student>>> stuByHakAndBan = Stream.of(students).collect(groupingBy(Student::getHak,
        groupingBy(Student::getBan)));

System.out.println("===Multi-level grouping 1.grade / 2.class ===");
for(Map<Integer, List<Student>> hak : stuByHakAndBan.values()) {
    for(List<Student> ban : hak.values()) {
        for(Student s : ban) {
            System.out.println(s);
        }
    }
}

Result

===Grouped by class===
Student{name='김자바', isMale=false, hak=1, ban=2, score=200}
Student{name='이자바', isMale=true, hak=2, ban=2, score=100}
Student{name='박자바', isMale=false, hak=3, ban=3, score=300}
Student{name='최자바', isMale=true, hak=1, ban=4, score=200}
===Multi-level grouping 1.grade / 2.class ===
Student{name='김자바', isMale=false, hak=1, ban=2, score=200}
Student{name='최자바', isMale=true, hak=1, ban=4, score=200}
Student{name='이자바', isMale=true, hak=2, ban=2, score=100}
Student{name='박자바', isMale=false, hak=3, ban=3, score=300}
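
The statistics collectors mentioned earlier are most useful as the downstream collector of groupingBy(). The sketch below is one possible illustration: it uses a trimmed-down stand-in for the Student class (only name, ban, and score, with romanized names), and `GroupingStatsDemo` and its helpers are hypothetical. A TreeMap is passed as the map factory so that the keys come out sorted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical demo class: groupingBy() combined with downstream collectors
class GroupingStatsDemo {
    // Simplified stand-in for the Student class above (assumption: only these fields are needed)
    static class Student {
        final String name;
        final int ban;
        final int score;

        Student(String name, int ban, int score) {
            this.name = name;
            this.ban = ban;
            this.score = score;
        }

        String getName() { return name; }
        int getBan() { return ban; }
        int getScore() { return score; }
    }

    static final List<Student> STUDENTS = Arrays.asList(
            new Student("Kim", 2, 200), new Student("Lee", 2, 100),
            new Student("Park", 3, 300), new Student("Choi", 4, 200));

    // Number of students per class
    static Map<Integer, Long> countByBan() {
        return STUDENTS.stream().collect(
                Collectors.groupingBy(Student::getBan, TreeMap::new, Collectors.counting()));
    }

    // Average score per class
    static Map<Integer, Double> avgScoreByBan() {
        return STUDENTS.stream().collect(
                Collectors.groupingBy(Student::getBan, TreeMap::new,
                        Collectors.averagingInt(Student::getScore)));
    }

    // Names per class: mapping() transforms each element before it is collected
    static Map<Integer, List<String>> namesByBan() {
        return STUDENTS.stream().collect(
                Collectors.groupingBy(Student::getBan, TreeMap::new,
                        Collectors.mapping(Student::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(countByBan());     // {2=2, 3=1, 4=1}
        System.out.println(avgScoreByBan());  // {2=150.0, 3=300.0, 4=200.0}
        System.out.println(namesByBan());     // {2=[Kim, Lee], 3=[Park], 4=[Choi]}
    }
}
```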