✨ Feature: dataFile 읽기 추가 #7

catturtle123 · 2025-08-17T13:30:13Z

📌 PR 개요

dataFile 읽기 추가

🔍 관련 이슈

Closes ✨ [Feature] 문서 조회 추가 #3

✨ 변경 사항

데이터 파일 읽기 추가
✨ feature: 데이터 조회 로직
🐛 fix: MB 변환 수정
🧪 test: 조회 테스트 추가

📸 스크린샷 / 동작 확인 결과 (선택)

필요하다면 이미지나 동영상 첨부.

✅ 체크리스트

코드 스타일을 준수했습니다.
불필요한 주석이나 콘솔 로그를 제거했습니다.
관련 테스트를 작성/수정했습니다.
로컬 환경에서 모든 기능이 정상 작동함을 확인했습니다.

🗒️ 추가 메모

기타 참고 사항을 적어주세요.

Summary by CodeRabbit

신기능
- 데이터 파일 목록 조회 API 추가: 커서 기반 페이지네이션, 각 항목에 제목·형식·최근 수정일·크기(MB)·태그·SHA-256 표시, 다음 페이지 여부 및 다음 커서 제공. 요청 파라미터 검증(최소/최대 제한) 적용.
성능 개선
- 연관 데이터 로딩 배치 크기 조정으로 목록 조회 시 DB 쿼리 수 감소 및 응답 성능 향상.
테스트
- 목록 조회 동작을 검증하는 통합 테스트 추가 및 테스트 환경 정리 로직 포함.
잡일(Chore)
- 테스트용 저장소 초기화 기능 추가.

coderabbitai · 2025-08-17T13:30:20Z

Walkthrough

커서 기반 페이지네이션으로 데이터 파일 목록 조회 기능을 추가했다. 컨트롤러 GET /api/v1/data-files, 서비스의 읽기 전용 페이징 로직, 리포지토리의 커서 쿼리, 상세/목록/태그 DTO, 테스트 정리 및 하이버네이트 batch fetch 설정이 포함된다.

Changes

Cohort / File(s)	Summary
Controller: 목록 조회 엔드포인트 추가 `src/main/kotlin/.../controller/DataFileController.kt`	GET /api/v1/data-files 엔드포인트 추가(cursor, take 유효성). 서비스(getDataFiles) 위임 후 ApiResponse 반환.
DTO: 상세/목록/태그 응답 추가 `src/main/kotlin/.../dto/DataFileResponseDTO.kt`	DataFileDetailResponseList, DataFileDetailResponse(+companion of), TagDTO 추가. 기존 DataFilePreviewResponse 유지.
Repository: 커서 페이징 및 태그 조회 `src/main/kotlin/.../repository/DataFileRepository.kt`, `src/main/kotlin/.../repository/DataFileTagRepository.kt`	커서 기반 Slice 조회 메서드(`findByIdGreaterThanOrderById(cursorId, pageable)`) 추가. 특정 DataFile의 연관 태그를 JOIN FETCH로 조회하는 `findTagsByDataFile` 추가.
Service: 목록 조회 로직 `src/main/kotlin/.../service/DataFileService.kt`	`@Transactional(readOnly = true)` `getDataFiles(cursor, take)` 추가: Slice 조회, 각 파일별 태그 조회 및 DTO 매핑, nextCursor/hasNext 계산 후 DataFileDetailResponseList 반환.
Config: JPA 성능 설정 `src/main/resources/application.yml`	`spring.jpa.properties.hibernate.default_batch_fetch_size: 500` 추가.
Tests / Test util `src/test/kotlin/.../service/DataFileServiceTest.kt`, `src/main/kotlin/.../global/storage/FakeS3Util.kt`	테스트용 clean() 추가 및 getDataFilesOK() 테스트 추가. FakeS3Util에 `clear()` 메서드 추가로 테스트용 스토어 초기화 가능.

Sequence Diagram(s)

sequenceDiagram
  participant C as Client
  participant Ctl as DataFileController
  participant Svc as DataFileService
  participant Repo as DataFileRepository
  participant TagRepo as DataFileTagRepository

  C->>Ctl: GET /api/v1/data-files?cursor&take
  Ctl->>Svc: getDataFiles(cursor, take)
  Svc->>Repo: findByIdGreaterThanOrderById(cursor, pageable)
  Repo-->>Svc: Slice<DataFile>
  loop each DataFile
    Svc->>TagRepo: findTagsByDataFile(dataFile)
    TagRepo-->>Svc: List<DataFileTag>
  end
  Svc-->>Ctl: DataFileDetailResponseList (data, cursor?, hasNext)
  Ctl-->>C: ApiResponse.ok(payload)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Assessment against linked issues

Objective	Addressed	Explanation
문서 리스트 조회 추가 (#3)	✅
엔티티 안에 있는 정보 모두 꺼내기 (#3)	✅

Assessment against linked issues: Out-of-scope changes

Code Change	Explanation
Hibernate default_batch_fetch_size 설정 추가 (`src/main/resources/application.yml`)	성능/설정 변경으로 목록 조회 기능 구현 요구사항(#3)와 직접적인 요구는 아님 — 관련성은 있지만 필수 요구사항에 포함되지 않음.

Poem

풀밭에 콩콩, 커서 타는 토끼예요 🐇
파일 줄지어, 태그도 한아름 주워요 🌿
슬라이스로 한입, 다음도 불러와 냠냠
API는 반짝, 테스트도 깔끔히 정리했지요 ✨
깡총, 목록은 준비됐어요 — 저장소로 출발!

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/#3-document-read

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 10

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9abde4a and 5b8c2bf.

📒 Files selected for processing (7)

src/main/kotlin/simplerag/ragback/domain/document/controller/DataFileController.kt (2 hunks)
src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt (1 hunks)
src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileRepository.kt (1 hunks)
src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileTagRepository.kt (1 hunks)
src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt (3 hunks)
src/main/resources/application.yml (1 hunks)
src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt (3 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (5)

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileRepository.kt (1)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFile.kt (1)

name (7-32)

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileTagRepository.kt (1)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFileTag.kt (1)

name (6-24)

src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt (1)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFileTag.kt (1)

name (6-24)

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt (3)

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileRequestDTO.kt (2)

message (13-20)

min (7-11)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFileTag.kt (1)

name (6-24)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFile.kt (1)

name (7-32)

src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt (1)

src/main/kotlin/simplerag/ragback/global/util/FileConvertUtil.kt (1)

sha256Hex (13-16)

🔇 Additional comments (1)

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt (1)

21-29: 불변성 강화 및 MB 단위/반올림 정책 확인 필수

id는 응답에 반드시 존재하므로 var Long? → val Long으로 변경하고, 매핑 시 requireNotNull(dataFile.id) { … }로 명시적 널→비널 검증을 추가하세요.
sizeMB 계산 단위(SI MB(10⁶) vs 이진 MiB(2²⁰))와 소수점 반올림 정책(예: 소수점 둘째 자리 반올림)을 확정하고, DTO 생성 시 명시적으로 처리하시길 권장합니다.

예시 diff:

 data class DataFileDetailResponse(
-    var id: Long?,
+    val id: Long,
     val title: String,
     val type: String,
     val lastModified: LocalDateTime,
-    val sizeMB: Double,
+    val sizeMB: Double, // TODO: 단위 및 소수점 반올림 정책 확정

     val tags: List<TagDTO>,
     val sha256: String,
 ) {
     companion object {
         fun of(dataFile: DataFile, tags: List<TagDTO>): DataFileDetailResponse {
             return DataFileDetailResponse(
-                dataFile.id,
+                requireNotNull(dataFile.id) { "Persisted DataFile must have non-null id" },
                 dataFile.title,
                 dataFile.type,
                 dataFile.updatedAt,
-                dataFile.sizeBytes / (1024.0 * 1024.0),
+                // (A) 이진 단위(MiB)
+                dataFile.sizeBytes / (1024.0 * 1024.0),
+                // (B) SI 단위(MB)
+                // dataFile.sizeBytes / 1_000_000.0,
+                // → 소수점 둘째 자리 반올림 예시:
+                // BigDecimal.valueOf(…)
+                //     .setScale(2, RoundingMode.HALF_UP)
+                //     .toDouble(),

                 tags,
                 dataFile.sha256,
             )
         }
     }
 }

위 항목을 기반으로 단위 및 반올림 정책을 확정해 주세요.

coderabbitai · 2025-08-17T13:41:40Z

src/main/kotlin/simplerag/ragback/domain/document/controller/DataFileController.kt

+    @GetMapping
+    fun getDataFiles(
+        @RequestParam(name = "cursor") cursor: Long,
+        @RequestParam(name = "take") take: Int,
+    ): ApiResponse<DataFileDetailResponseList> {
+        val data = dataFileService.getDataFiles(cursor, take)
+        return ApiResponse.ok(data)
+    }


💡 Verification agent

🧩 Analysis chain

쿼리 파라미터의 사용성/안전성 개선: cursor 선택적, take 범위 제한 및 기본값 권장

첫 페이지 진입을 위해 cursor는 선택적이어야 합니다. 현재는 필수 Long이라 호출자가 의미 없는 0 등을 넣어야 합니다.

take에는 범위 제한(@Min/@max)과 기본값을 권장합니다(예: 1..100, 기본 20).

다음처럼 시그니처를 조정하고, 서비스에 null 커서 처리(= 최초 페이지)를 반영해 주세요:

- fun getDataFiles( - @RequestParam(name = "cursor") cursor: Long, - @RequestParam(name = "take") take: Int, - ): ApiResponse<DataFileDetailResponseList> { + fun getDataFiles( + @RequestParam(name = "cursor", required = false) cursor: Long?, + @RequestParam(name = "take", defaultValue = "20") take: Int, + ): ApiResponse<DataFileDetailResponseList> { val data = dataFileService.getDataFiles(cursor, take) return ApiResponse.ok(data) }

추가로, 상단 import에 다음을 포함해 take의 범위를 보장하세요(파일 상단 import 영역에 추가 필요):

jakarta.validation.constraints.Min

jakarta.validation.constraints.Max

예: @Min(1) @Max(100) take: Int

PR 설명에서 “Closes #4”는 “문서 삭제 기능” 이슈로 보입니다. 본 PR은 “dataFile 목록 조회” 기능이므로 잘못 연결된 것으로 판단됩니다. 이슈 링크 및 키워드를 재검토해 주세요. 필요하면 제가 설명/제목 초안까지 정리해 드리겠습니다.

🏁 Script executed:

#!/bin/bash # 검색: getDataFiles 메서드 정의 위치 확인 rg -nP "fun getDataFiles" -C3 --glob '*.kt'

Length of output: 2482

🏁 Script executed:

#!/bin/bash # DataFileController.kt 상단 import 확인 sed -n '1,50p' src/main/kotlin/simplerag/ragback/domain/document/controller/DataFileController.kt

Length of output: 1901

API 사용성 개선을 위한 파라미터 시그니처 및 PR 메타정보 수정 요청

아래 사항을 반영해 주세요:

• cursor를 선택 Optional(Long?)으로 변경하고 서비스 로직에서 null → 첫 페이지(0) 처리
• take에 기본값 설정(defaultValue = "20") 및 범위 검증(@min(1), @max(100)) 추가
• 변경된 시그니처에 맞춰 DataFileService.getDataFiles도 cursor: Long? 처리 로직 구현
• import jakarta.validation.constraints.Min 및 import jakarta.validation.constraints.Max 추가
• PR 설명의 “Closes #4”는 문서 삭제 이슈로 보이며, 본 PR은 목록 조회 기능이므로 이슈 키워드/링크 재검토

예시 diff:

- @GetMapping - fun getDataFiles( - @RequestParam(name = "cursor") cursor: Long, - @RequestParam(name = "take") take: Int, - ): ApiResponse<DataFileDetailResponseList> { + @GetMapping + fun getDataFiles( + @RequestParam(name = "cursor", required = false) cursor: Long?, + @RequestParam(name = "take", defaultValue = "20") @Min(1) @Max(100) take: Int, + ): ApiResponse<DataFileDetailResponseList> { val data = dataFileService.getDataFiles(cursor, take) return ApiResponse.ok(data) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@GetMapping

fun getDataFiles(

@RequestParam(name = "cursor") cursor: Long,

@RequestParam(name = "take") take: Int,

): ApiResponse<DataFileDetailResponseList> {

val data = dataFileService.getDataFiles(cursor, take)

return ApiResponse.ok(data)

}

@GetMapping

fun getDataFiles(

@RequestParam(name = "cursor", required = false) cursor: Long?,

@RequestParam(name = "take", defaultValue = "20") @Min(1) @Max(100) take: Int,

): ApiResponse<DataFileDetailResponseList> {

val data = dataFileService.getDataFiles(cursor, take)

return ApiResponse.ok(data)

}

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/controller/DataFileController.kt around lines 45-52, change the controller method signature so cursor is nullable (cursor: Long?) and optional, set take to have defaultValue = "20" and annotate with @Min(1) and @Max(100) (add imports jakarta.validation.constraints.Min and jakarta.validation.constraints.Max), update the body to convert a null cursor to 0 when calling the service, and update DataFileService.getDataFiles to accept cursor: Long? and handle null as first page (0) in its logic; also remove or correct the PR description that says “Closes #4” since that issue refers to deletion and this PR implements listing — update the issue keyword/link accordingly.

coderabbitai · 2025-08-17T13:41:40Z

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt

+data class DataFileDetailResponseList(
+    val dataFilePreviewResponseList: List<DataFileDetailResponse>,
+    val cursor: Long?,
+    val hasNext: Boolean,
+)


🛠️ Refactor suggestion

DTO 필드명 불일치: Detail 응답인데 필드명이 Preview 리스트를 가리킵니다

DataFileDetailResponseList의 리스트 필드명이 dataFilePreviewResponseList로 되어 있어 혼동을 유발합니다. 새 API 스키마이므로 지금 시점에 의미 있는 이름으로 정리하는 것이 좋습니다. 예: data 또는 items, 그리고 cursor는 의미상 nextCursor가 더 명확합니다.

아래와 같이 변경을 권장합니다(호출부/테스트도 함께 수정 필요):

-data class DataFileDetailResponseList( - val dataFilePreviewResponseList: List<DataFileDetailResponse>, - val cursor: Long?, - val hasNext: Boolean, -) +data class DataFileDetailResponseList( + val data: List<DataFileDetailResponse>, + val nextCursor: Long?, + val hasNext: Boolean, +)

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt around lines 15 to 19, the DTO field names are misleading: rename dataFilePreviewResponseList to a meaningful name such as items (or data) to reflect that this is a detail response list, and rename cursor to nextCursor for clarity; update the data class property list and cursor name, then update all callers, serializers, and tests to use the new property names to avoid compilation failures and ensure API/schema consistency.

coderabbitai · 2025-08-17T13:41:40Z

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileRepository.kt

 interface DataFileRepository : JpaRepository<DataFile, Long> {
    fun existsBySha256(sha256: String): Boolean
+
+    fun findByOrderById(pageable: Pageable): Slice<DataFile>


💡 Verification agent

🧩 Analysis chain

커서 기반 페이지네이션 로직 미반영: 현재 메서드는 cursor 파라미터를 전혀 사용하지 못합니다

findByOrderById(pageable)는 오프셋 기반 정렬만 수행하며, id > :cursor 같은 커서 필터가 없습니다. 컨트롤러/서비스가 커서 값을 받고 있음에도 실제 쿼리에 반영되지 않아 항상 처음부터 페이지가 조회될 가능성이 큽니다.

커서 기반으로 올바르게 동작하려면 다음과 같은 시그니처를 추가하고 서비스에서 이를 사용하세요:

interface DataFileRepository : JpaRepository<DataFile, Long> { fun existsBySha256(sha256: String): Boolean - fun findByOrderById(pageable: Pageable): Slice<DataFile> + // 커서보다 큰 id만 조회(오름차순). 첫 페이지의 경우 cursor를 0L로 간주하면 됩니다. + fun findByIdGreaterThanOrderById(id: Long, pageable: Pageable): Slice<DataFile> }

또는 내림차순 커서가 필요하면 findByIdLessThanOrderByIdDesc 변형을 고려하십시오.

서비스 구현에서 실제로 어떤 메서드를 호출하는지 확인해 주세요. 필요 시 제가 서비스/컨트롤러 수정 패치까지 함께 제안하겠습니다.

서비스/컨트롤러에서 실제로 어떤 메서드를 호출하는지 확인하기 위해 아래 스크립트를 실행해 주세요:

🏁 Script executed:

#!/bin/bash # DataFileRepository.findByOrderById 사용 위치 검색 rg -n --type=kt "findByOrderById" -C3

Length of output: 64

서비스/컨트롤러에서 실제로 어떤 메서드를 호출하는지 확인하기 위해 아래 스크립트를 실행해 주세요:

🏁 Script executed:

#!/bin/bash # DataFileRepository 사용 위치 검색 rg -n "DataFileRepository" -C3 # findByOrderById 호출 검색 (정의 파일 제외) rg -n "findByOrderById" -g '*.kt' -g '!DataFileRepository.kt' -C3

Length of output: 5654

커서 기반 페이지네이션 로직 적용 필요

현재 DataFileService.getDataFiles()에서 cursor를 PageRequest.of(cursor.toInt(), take)의 페이지 인덱스로 사용하고 있어, 실제로는 오프셋 기반 페이징만 수행됩니다. id > cursor 조건이 빠져 커서 기반 페이지네이션이 적용되지 않습니다.

수정이 필요한 위치:

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileRepository.kt:11

src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt:82

제안 패치:

interface DataFileRepository : JpaRepository<DataFile, Long> { fun existsBySha256(sha256: String): Boolean - fun findByOrderById(pageable: Pageable): Slice<DataFile> + // cursor(id)보다 큰 항목만 오름차순으로 조회합니다. + fun findByIdGreaterThanOrderById(cursorId: Long, pageable: Pageable): Slice<DataFile> }

class DataFileService( /*…*/ ) { @Transactional(readOnly = true) fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList { - val dataSlice = dataFileRepository.findByOrderById(PageRequest.of(cursor.toInt(), take)) + // 첫 페이지는 cursor=0L로 간주하고, 이후에는 cursorId를 기준으로 조회 + val pageable = PageRequest.of(0, take) + val dataSlice = dataFileRepository.findByIdGreaterThanOrderById(cursor, pageable) val dataFileList: MutableList<DataFileDetailResponse> = ArrayList() dataSlice.forEach { dataFile -> /*…*/ }

내림차순 커서가 필요하다면 findByIdLessThanOrderByIdDesc 변형을 추가 고려해 주세요. 변경 후 컨트롤러/서비스에서도 cursor를 페이지 인덱스가 아닌 마지막 조회된 id로 올바르게 전달해야 합니다.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

fun findByOrderById(pageable: Pageable): Slice<DataFile>

interface DataFileRepository : JpaRepository<DataFile, Long> {

fun existsBySha256(sha256: String): Boolean

// cursor(id)보다 큰 항목만 오름차순으로 조회합니다.

fun findByIdGreaterThanOrderById(cursorId: Long, pageable: Pageable): Slice<DataFile>

}

Suggested change

fun findByOrderById(pageable: Pageable): Slice<DataFile>

class DataFileService(

/*…*/

) {

@Transactional(readOnly = true)

fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList {

// 첫 페이지는 cursor=0L로 간주하고, 이후에는 cursorId를 기준으로 조회

val pageable = PageRequest.of(0, take)

val dataSlice = dataFileRepository.findByIdGreaterThanOrderById(cursor, pageable)

val dataFileList: MutableList<DataFileDetailResponse> = ArrayList()

dataSlice.forEach { dataFile -> /*…*/ }

// …

}

}

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileRepository.kt around line 11 and src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt around line 82, the code currently uses PageRequest.of(cursor.toInt(), take) treating cursor as a page index; change to cursor-based queries instead. Add a repository method such as findByIdGreaterThanOrderByIdAsc(id: Long, pageable: Pageable): Slice<DataFile> (or if you need descending cursor pagination add findByIdLessThanOrderByIdDesc), then in the service parse the cursor into a Long lastId (default 0 or null for start), build a Pageable with PageRequest.of(0, take, Sort.by("id").ascending()/descending()), and call the new repository method with lastId and that pageable; ensure the controller passes the last returned id as the next cursor rather than a page index.

coderabbitai · 2025-08-17T13:41:40Z

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileTagRepository.kt

+    @Query("""
+        SELECT DISTINCT t
+        FROM DataFileTag dft
+        JOIN dft.tag t
+        WHERE dft.dataFile = :dataFile
+    """)
+    fun findTagsByDataFile(@Param("dataFile") dataFile: DataFile): List<Tag>


🛠️ Refactor suggestion

N+1 위험: 파일별 태그 개별 조회 대신 배치 조회 메서드를 추가하세요

현재는 데이터파일 목록의 각 항목마다 findTagsByDataFile를 호출하게 되어 페이지 크기만큼 쿼리가 반복됩니다. 배치 조회로 쿼리 수를 1~2개로 줄일 수 있습니다.

아래와 같이 배치용 메서드를 추가하고 서비스에서 한 번에 조회 후 그룹핑하세요:

interface DataFileTagRepository : JpaRepository<DataFileTag, Long> { fun existsByDataFileIdAndTagId(dataFileId: Long, tagId: Long): Boolean @Query(""" SELECT DISTINCT t FROM DataFileTag dft JOIN dft.tag t WHERE dft.dataFile = :dataFile """) fun findTagsByDataFile(@Param("dataFile") dataFile: DataFile): List<Tag> + + // 한 번에 여러 파일의 태그를 패치 조인으로 가져오기 + @Query(""" + SELECT dft + FROM DataFileTag dft + JOIN FETCH dft.tag t + WHERE dft.dataFile IN :dataFiles + """) + fun findAllByDataFileInFetchTag(@Param("dataFiles") dataFiles: List<DataFile>): List<DataFileTag> }

서비스에서는 findAllByDataFileInFetchTag(files) 결과를 dataFile.id로 그룹핑하여 TagDTO 리스트를 만들면 쿼리 1회로 충분합니다. 이렇게 하면 default_batch_fetch_size에 의존하지 않고도 확실하게 N+1을 제거할 수 있습니다.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@Query("""

SELECT DISTINCT t

FROM DataFileTag dft

JOIN dft.tag t

WHERE dft.dataFile = :dataFile

""")

fun findTagsByDataFile(@Param("dataFile") dataFile: DataFile): List<Tag>

interface DataFileTagRepository : JpaRepository<DataFileTag, Long> {

fun existsByDataFileIdAndTagId(dataFileId: Long, tagId: Long): Boolean

@Query("""

SELECT DISTINCT t

FROM DataFileTag dft

JOIN dft.tag t

WHERE dft.dataFile = :dataFile

""")

fun findTagsByDataFile(@Param("dataFile") dataFile: DataFile): List<Tag>

// 한 번에 여러 파일의 태그를 패치 조인으로 가져오기

@Query("""

SELECT dft

FROM DataFileTag dft

JOIN FETCH dft.tag t

WHERE dft.dataFile IN :dataFiles

""")

fun findAllByDataFileInFetchTag(@Param("dataFiles") dataFiles: List<DataFile>): List<DataFileTag>

}

coderabbitai · 2025-08-17T13:41:41Z

src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt

+    @Transactional(readOnly = true)
+    fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList {
+        val dataSlice = dataFileRepository.findByOrderById(PageRequest.of(cursor.toInt(), take))
+


🛠️ Refactor suggestion

⚠️ Potential issue

커서 기반 아님 — Page 인덱스를 커서로 오용 중 (용어/의도 불일치)

PageRequest.of(cursor.toInt(), take)는 “커서=페이지 인덱스”입니다. 커서 기반(keyset) 페이지네이션을 구현하려면 id > lastId 조건으로 정렬 후 take+1 패턴을 써야 합니다. 현재 PR 요약과도 불일치합니다.

개선 옵션:

옵션 A(권장): 커서=마지막 id로 해석하도록 keyset 페이지네이션으로 전환(리포지토리 메서드 추가 필요: findByIdGreaterThanOrderByIdAsc(lastId, pageable)).

옵션 B(완화): 이번 변경은 Page 기반으로 유지하되 파라미터명을 page로 변경하고 API/설명을 정정.

최소한의 안전장치(입력 검증/오버플로 완화) 적용 제안:

@Transactional(readOnly = true) -fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList { - val dataSlice = dataFileRepository.findByOrderById(PageRequest.of(cursor.toInt(), take)) +fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList { + require(take in 1..100) { "take must be between 1 and 100" } + require(cursor >= 0) { "cursor must be >= 0" } + // Page 기반 유지 시 오버플로 방지 + val pageIndex = cursor.coerceAtMost(Int.MAX_VALUE.toLong()).toInt() + val dataSlice = dataFileRepository.findByOrderById(PageRequest.of(pageIndex, take))

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@Transactional(readOnly = true)

fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList {

val dataSlice = dataFileRepository.findByOrderById(PageRequest.of(cursor.toInt(), take))

@Transactional(readOnly = true)

fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList {

require(take in 1..100) { "take must be between 1 and 100" }

require(cursor >= 0) { "cursor must be >= 0" }

// Page 기반 유지 시 오버플로 방지

val pageIndex = cursor.coerceAtMost(Int.MAX_VALUE.toLong()).toInt()

val dataSlice = dataFileRepository.findByOrderById(PageRequest.of(pageIndex, take))

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt around lines 80-83, the current implementation misuses PageRequest.of(cursor.toInt(), take) treating a cursor as a page index; change to real keyset pagination: interpret cursor as lastId, call a repository method that returns Ordered results with id > lastId (e.g. add findByIdGreaterThanOrderByIdAsc(lastId, Pageable)), request size take+1 to detect a next cursor and return only take items while setting the next cursor to the last item’s id; alternatively, if you want to keep Page-based behavior, rename the parameter from cursor to page and update API/docs accordingly; also add minimal validation for negative/overflow cursor values before converting to int.

coderabbitai · 2025-08-17T13:41:41Z

src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt

+        val dataFileList: MutableList<DataFileDetailResponse> = ArrayList()
+        dataSlice.forEach{ dataFile ->
+            val tags: List<Tag> = dataFileTagRepository.findTagsByDataFile(dataFile)
+
+            val tagDtos: List<TagDTO> = tags.map{tag -> TagDTO(tag.id, tag.name)}
+
+            dataFileList.add(DataFileDetailResponse.of(dataFile, tagDtos))
+        }


🧹 Nitpick (assertive)

🛠️ Refactor suggestion

N+1 쿼리 발생: 태그 개별 조회를 배치/조인으로 전환 권장

각 DataFile마다 findTagsByDataFile 호출은 목록 규모에 비례해 N+1 쿼리를 유발합니다. 배치 페치 설정이 있더라도 응답 시간과 커넥션 사용량에 불리합니다.

개선 방향:

리포지토리에 다건 조회 메서드 추가(예: findTagsByDataFileIds(ids): List<(dataFileId, Tag)>), 결과를 groupBy 후 매핑.

혹은 DataFileTag -> Tag를 join fetch로 한 번에 불러오는 쿼리 메서드(예: findWithTagsByIdGreaterThanOrderByIdAsc) 설계.

Slice/Page 한 번에 가져오고, 1회 추가 쿼리 내에서 모든 태그를 조회하는 1+1 패턴 유지.

원한다면 리포지토리/서비스 시그니처까지 포함한 리팩터링 패치를 제안해 드리겠습니다.

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt around lines 84-91, the code iterates dataSlice and calls dataFileTagRepository.findTagsByDataFile for each DataFile causing an N+1 query problem; change this to fetch tags for all DataFile ids in one call and then map them back: add a repository method that returns tags for a list of dataFileIds (or a join-fetch query returning pairs), call it once with dataSlice.map { it.id }, groupBy dataFileId to build TagDTO lists, and then construct DataFileDetailResponse objects from that grouped result to replace the per-item repository call.

coderabbitai · 2025-08-17T13:41:41Z

src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt

+        var nextCursor: Long? = null;
+        if (dataFileList.isNotEmpty()) {
+            nextCursor = dataFileList[dataFileList.size - 1].id
+        }
+
+        return DataFileDetailResponseList(dataFileList, nextCursor, dataSlice.hasNext())
+    }


🧹 Nitpick (assertive)

Kotlin 스타일 및 간결화 제안: 세미콜론 제거, nextCursor 계산 단순화

가독성 개선과 관용구 활용을 권장합니다.

- var nextCursor: Long? = null; - if (dataFileList.isNotEmpty()) { - nextCursor = dataFileList[dataFileList.size - 1].id - } - - return DataFileDetailResponseList(dataFileList, nextCursor, dataSlice.hasNext()) + val nextCursor: Long? = dataFileList.lastOrNull()?.id + return DataFileDetailResponseList(dataFileList, nextCursor, dataSlice.hasNext())

또한 dataFileList 빌드는 forEach 대신 map을 사용하는 방식으로 간결화할 수 있습니다(태그 조회 최적화 작업 시 함께 적용 권장).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

var nextCursor: Long? = null;

if (dataFileList.isNotEmpty()) {

nextCursor = dataFileList[dataFileList.size - 1].id

}

return DataFileDetailResponseList(dataFileList, nextCursor, dataSlice.hasNext())

}

val nextCursor: Long? = dataFileList.lastOrNull()?.id

return DataFileDetailResponseList(dataFileList, nextCursor, dataSlice.hasNext())

}

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt around lines 93 to 99, remove unnecessary semicolons and simplify nextCursor calculation by using Kotlin idioms (e.g., val nextCursor = dataFileList.lastOrNull()?.id) instead of manual index checks; also refactor the dataFileList construction to use map instead of forEach to build the list in a single expression (this will make the code more concise and idiomatic and prepares it for later tag lookup optimizations).

coderabbitai · 2025-08-17T13:41:41Z

src/main/resources/application.yml

      hibernate:
        dialect: org.hibernate.dialect.PostgreSQLDialect
        format_sql: true
+        default_batch_fetch_size: 500


🧹 Nitpick (assertive)

default_batch_fetch_size=500은 과도할 수 있고, 현재 조회 방식에서는 효과가 제한적입니다

500은 DB/드라이버에 따라 너무 큰 IN 리스트를 유발해 성능이 악화될 수 있습니다. 일반적으로 50~200 사이가 안전한 편입니다.

또한 본 PR의 태그 조회는 개별 파일마다 별도 쿼리(findTagsByDataFile)를 호출하는 패턴이라, Hibernate의 batch fetch가 큰 도움을 주지 못합니다. 태그를 배치로 조회하도록 리포지토리를 확장하는 쪽이 효과적입니다(아래 다른 코멘트 참고).

권장:

우선 값을 보수적으로 낮추고(예: 100), 태그 배치 조회 최적화를 병행하세요.

다음 간단 변경으로 보수적으로 조정할 수 있습니다:

- default_batch_fetch_size: 500 + default_batch_fetch_size: 100

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

default_batch_fetch_size: 500

default_batch_fetch_size: 100

🤖 Prompt for AI Agents

In src/main/resources/application.yml around line 19, default_batch_fetch_size is set to 500 which is too large for this usage and provides limited benefit given per-file tag queries; change the value to a conservative default (e.g., 100) in this file and simultaneously add a follow-up task to implement repository-level batch tag fetching (a single query that returns tags for multiple data files) and then re-evaluate the optimal batch size after measuring performance.

src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt

coderabbitai · 2025-08-17T13:41:41Z

src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt

+    @Test
+    @DisplayName("데이터 조회가 잘 된다")
+    @Transactional
+    fun getDataFilesOK() {
+        // given
+        val bytes1 = "test1".toByteArray()
+        val sha1 = sha256Hex(bytes1)
+        val bytes2 = "test2".toByteArray()
+        val sha2 = sha256Hex(bytes2)
+        dataFileRepository.saveAll(
+            listOf(
+                DataFile(
+                title = "exists",
+                type = "text/plain",
+                sizeBytes = 0,
+                sha256 = sha1,
+                fileUrl = "fake://original/exists.txt",
+            ),
+                DataFile(
+                    title = "exists2",
+                    type = "text/pdf",
+                    sizeBytes = 0,
+                    sha256 = sha2,
+                    fileUrl = "fake://original/exists.txt",
+                )
+            )
+        )
+
+        val cursor = 0L
+        val take = 2
+
+        // when
+        val dataFiles = dataFileService.getDataFiles(cursor, take)
+
+        // then
+        val dataFileDetailResponse = dataFiles.dataFilePreviewResponseList[0]
+        assertEquals(dataFileDetailResponse.title, "exists")
+        assertEquals(dataFileDetailResponse.type, "text/plain")
+        assertEquals(dataFileDetailResponse.sizeMB, 0.0)
+        assertEquals(dataFileDetailResponse.sha256, sha1)
+
+        val dataFileDetailResponse2 = dataFiles.dataFilePreviewResponseList[1]
+        assertEquals(dataFileDetailResponse2.title, "exists2")
+        assertEquals(dataFileDetailResponse2.type, "text/pdf")
+        assertEquals(dataFileDetailResponse2.sizeMB, 0.0)
+        assertEquals(dataFileDetailResponse2.sha256, sha2)
+
+        assertEquals(dataFiles.cursor, dataFileDetailResponse2.id)
+        assertEquals(dataFiles.hasNext, false)
+    }
+


💡 Verification agent

🧩 Analysis chain

조회 테스트가 전역 상태에 취약함 + “커서” 의미가 구현과 불일치 가능성

현재 테스트는 cursor=0L, take=2로 호출하면서 nextCursor가 “마지막 항목의 id”가 되는 것을 검증합니다. 그러나 서비스 구현은 PageRequest.of(cursor.toInt(), take)로 보이며, 이는 “커서=페이지 인덱스” 해석입니다. 이름/의도(커서=마지막 id)와 구현(페이지 인덱스)이 상충합니다. 데이터가 누적되면 페이지 0의 결과가 달라져 테스트가 흔들릴 수 있습니다.

위 문제는 @PostConstruct 정리가 1회만 이뤄지는 점과 결합되어, 이전 테스트의 커밋 데이터가 섞일 경우 hasNext 검증도 흔들릴 수 있습니다.

다음을 확인해 주세요:

DataFileRepository에서 사용 중인 메서드 시그니처와 반환 타입(Page/Slice), DTO 필드명(dataFilePreviewResponseList인지 등).

컨트롤러/서비스에서 “cursor”를 “페이지 인덱스”로 쓰는지, “마지막 id”로 쓰는지.

스크립트(리포지토리 조사):

개선 제안:

단기: 현재 구현이 Page 기반이라면 파라미터명을 page로 명확히 하고 테스트 설명도 페이지네이션으로 정정.

중기: 진짜 커서(keyset) 기반으로 전환(예: id > lastId 조건 + take+1 조회로 hasNext 판단).

🏁 Script executed:

#!/bin/bash # Repository 메서드/DTO 필드 구조 확인 rg -n -A3 -B3 'interface DataFileRepository|class DataFileRepository' --type=kotlin rg -n -A3 -B3 'findBy(OrderBy)?Id|findByOrderById' --type=kotlin rg -n -A3 -B3 'class DataFileDetailResponseList|data class DataFileDetailResponseList' --type=kotlin rg -n -A3 -B3 'dataFilePreviewResponseList|dataFileDetailResponseList' --type=kotlin

Length of output: 11925

조회 테스트 불안정성 및 cursor 파라미터 의미 불일치

짚어본 결과

Repository: findByOrderById(pageable: Pageable): Slice<DataFile> → 페이지 기반 조회

Service:
fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList { val dataSlice = dataFileRepository.findByOrderById(PageRequest.of(cursor.toInt(), take)) // … return DataFileDetailResponseList( dataFilePreviewResponseList = …, cursor = lastId, hasNext = dataSlice.hasNext ) }
여기서 cursor를 페이지 인덱스로 사용(PageRequest.of(cursor.toInt(), take))하면서도, 테스트는 cursor=0L 호출 후 반환된 cursor를 “마지막 항목의 id”로 검증하고 있어 의미가 어긋납니다.

DTO:
data class DataFileDetailResponseList( val dataFilePreviewResponseList: List<DataFileDetailResponse>, val cursor: Long?, val hasNext: Boolean, )
필드명(dataFilePreviewResponseList, cursor, hasNext)은 테스트와 일치합니다.

문제

파라미터명·의도 불일치

cursor를 “마지막 id”로 기대하는 이름/테스트 vs 실제 “페이지 번호”로 처리

테스트 격리 부재

@PostConstruct로 초기 데이터가 한 번만 세팅되므로, 이전 테스트가 남긴 레코드가 섞이면 결과(hasNext, 페이지 0 내용) 검증이 불안정해집니다.

해결 제안

단기(필수)

Service 메서드 시그니처 변경:
- fun getDataFiles(cursor: Long, take: Int): … + fun getDataFiles(page: Int, size: Int): …

테스트: 파라미터명·설명 → 페이지네이션 기반으로 수정(page = 0, size = 2, 반환된 cursor → 다음 페이지 인덱스 검증 등)

테스트 격리 강화: @BeforeEach에 dataFileRepository.deleteAll() 또는 @DirtiesContext 적용

중기(권장)

Keyset 페이징 전환: WHERE id > lastId ORDER BY id LIMIT size + 1 방식으로 hasNext 판단 및 실제 cursor = lastId 사용

위 변경으로 이름·의도 일치, 테스트 안정성을 확보할 수 있습니다.

🤖 Prompt for AI Agents

In src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt around lines 217 to 267, the test assumes `cursor` is the last item id while the service uses it as a page index and also relies on shared data created elsewhere; update the test and setup to match the service pagination semantics and isolate data: add a @BeforeEach that calls dataFileRepository.deleteAll(), rename test variables to `page` and `size` (use page = 0, size = 2), call getDataFiles(page, size) and assert that returned cursor equals the next page index (or the service's documented page cursor) and hasNext matches the slice.hasNext; alternatively, if you want cursor to be last-id keyset paging, change the service signature to accept lastId and implement keyset query (WHERE id > lastId ORDER BY id LIMIT size+1) and adjust tests accordingly—pick one approach, make parameter names consistent, and ensure tests clear repository state before each run.

coderabbitai

Actionable comments posted: 5

🔭 Outside diff range comments (1)

src/main/kotlin/simplerag/ragback/global/storage/FakeS3Util.kt (1)
19-31: MultipartFile.getBytes 두 번 호출로 인한 불필요한 복사/할당

MultipartFile#getBytes()는 구현체에 따라 매 호출 시 새로운 배열을 반환할 수 있어, 현재 구현은 동일 바이트 배열을 2회 획득하면서 불필요한 복사/할당이 발생할 수 있습니다. 한 번만 읽어 로컬 변수에 담아 재사용하세요.

아래처럼 개선을 권장합니다:
-        val hash = sha256Hex(file.bytes).take(12)
+        val bytes = file.bytes
+        val hash = sha256Hex(bytes).take(12)
         val prefix = dir.label.trim('/')
         val key = "$prefix/${hash}_$clean"
 
-        store[key] = file.bytes
+        store[key] = bytes

♻️ Duplicate comments (3)

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt (1)
15-19: DTO 필드명 구체성 ↓: 리스트/커서 명칭을 간결하고 표준화된 이름으로 리네이밍 권장

현재 dataFileDetailResponseList, cursor는冗長/모호합니다. API 일관성과 가독성을 위해 data, nextCursor 등의 일반화된 명칭을 권장합니다. 호출부/테스트 수정 필요.
 data class DataFileDetailResponseList(
-    val dataFileDetailResponseList: List<DataFileDetailResponse>,
-    val cursor: Long?,
+    val data: List<DataFileDetailResponse>,
+    val nextCursor: Long?,
     val hasNext: Boolean,
 )
src/main/kotlin/simplerag/ragback/domain/document/controller/DataFileController.kt (1)
47-54: API 사용성 개선: cursor 선택적 + take 기본값 권장

첫 페이지 진입을 위해 cursor는 선택적이어야 하며, take는 합리적 기본값을 갖는 것이 일반적입니다. 또한 컨트롤러에서 null일 때 첫 페이지로 처리해 주세요.
 @GetMapping
 fun getDataFiles(
-    @RequestParam(name = "cursor") cursor: Long,
-    @RequestParam(name = "take") @Min(1) @Max(100) take: Int,
+    @RequestParam(name = "cursor", required = false) cursor: Long?,
+    @RequestParam(name = "take", defaultValue = "20") @Min(1) @Max(100) take: Int,
 ): ApiResponse<DataFileDetailResponseList> {
-    val data = dataFileService.getDataFiles(cursor, take)
+    val data = dataFileService.getDataFiles(cursor ?: 0L, take)
     return ApiResponse.ok(data)
 }
src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt (1)
80-82: 입력값 검증 추가 권장: 서비스 계층에서도 take/cursor 가드

컨트롤러에서 제약을 두더라도 서비스 계층에서의 최소 검증이 있으면 오용과 잠재적 오버플로를 방지할 수 있습니다.

아래처럼 간단한 require를 추가해 주세요:
 @Transactional(readOnly = true)
 fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList {
+    require(cursor >= 0) { "cursor must be >= 0" }
+    require(take in 1..100) { "take must be between 1 and 100" }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5b8c2bf and c663f22.

📒 Files selected for processing (7)

src/main/kotlin/simplerag/ragback/domain/document/controller/DataFileController.kt (2 hunks)
src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt (1 hunks)
src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileRepository.kt (1 hunks)
src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileTagRepository.kt (1 hunks)
src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt (3 hunks)
src/main/kotlin/simplerag/ragback/global/storage/FakeS3Util.kt (1 hunks)
src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt (3 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (7)

src/main/kotlin/simplerag/ragback/global/storage/FakeS3Util.kt (2)

src/main/kotlin/simplerag/ragback/global/util/S3UtilImpl.kt (1)

{ null } (108-108)

src/main/kotlin/simplerag/ragback/global/util/S3Util.kt (1)

delete (9-9)

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileTagRepository.kt (1)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFileTag.kt (1)

name (6-24)

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt (3)

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileRequestDTO.kt (2)

message (13-20)

min (7-11)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFileTag.kt (1)

name (6-24)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFile.kt (1)

name (7-32)

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileRepository.kt (2)

src/main/kotlin/simplerag/ragback/domain/index/entity/DataFileIndex.kt (1)

name (7-22)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFile.kt (1)

name (7-32)

src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt (1)

src/main/kotlin/simplerag/ragback/domain/document/entity/DataFileTag.kt (1)

name (6-24)

src/main/kotlin/simplerag/ragback/domain/document/controller/DataFileController.kt (1)

src/main/kotlin/simplerag/ragback/domain/index/entity/DataFileIndex.kt (1)

name (7-22)

src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt (1)

src/main/kotlin/simplerag/ragback/global/util/FileConvertUtil.kt (1)

sha256Hex (13-16)

🔇 Additional comments (6)

src/main/kotlin/simplerag/ragback/global/storage/FakeS3Util.kt (1)

53-53: 테스트 편의 메서드 clear() 추가 LGTM

테스트 격리/정리 용도로 적절합니다. ConcurrentHashMap.clear()는 원자적으로 동작하며 테스트 프로필에만 활성화되어 안전합니다.

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt (1)

45-48: 태그 DTO 추가 LGTM

간결하고 충분한 스키마입니다. 도메인 의존성 없이 직렬화에 적합합니다.

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileRepository.kt (1)

11-11: 커서 기반 페이징 메서드 추가 LGTM

findByIdGreaterThanOrderById는 커서 기반(증가 방향) 페이징 요건에 부합합니다. 서비스에서 마지막 항목의 id를 nextCursor로 전달하면 일관되게 동작합니다.

src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt (1)

44-50: 테스트 정리 전략 전환이 적절함 (@beforeeach + S3 초기화 적용) — LGTM

매 테스트 전에 DB와 FakeS3 상태를 초기화하도록 변경한 점이 테스트 독립성과 신뢰성을 높입니다.

src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt (2)

83-84: Page 인덱스 오용 이슈가 해소됨 — keyset(커서) 기반 조회로 전환 확인

findByIdGreaterThanOrderById(cursor, PageRequest.of(0, take))로 커서를 마지막 id로 해석하는 keyset 패턴을 적용한 점 좋습니다. 이전의 페이지 인덱스를 커서로 쓰던 문제를 해결했습니다.

98-100: nextCursor 계산과 hasNext 전달 방식 적절 — LGTM

lastOrNull()?.id로 커서를 설정하고, Slice.hasNext()를 그대로 노출하는 방식은 직관적입니다.

coderabbitai · 2025-08-17T14:00:18Z

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt

+data class DataFileDetailResponse(
+    var id: Long?,
+    val title: String,
+    val type: String,
+    val lastModified: LocalDateTime,
+    val sizeMB: Double,
+    val tags: List<TagDTO>,
+    val sha256: String,
+) {


🛠️ Refactor suggestion

DTO 불변성 유지: id는 가변(var)보다 불변(val)이 적절

응답 DTO에서 id가 변경될 이유가 없으므로 var → val로 불변성을 유지하는 것이 안전합니다.

data class DataFileDetailResponse( - var id: Long?, + val id: Long?, val title: String, val type: String, val lastModified: LocalDateTime, val sizeMB: Double, val tags: List<TagDTO>, val sha256: String, )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

data class DataFileDetailResponse(

var id: Long?,

val title: String,

val type: String,

val lastModified: LocalDateTime,

val sizeMB: Double,

val tags: List<TagDTO>,

val sha256: String,

) {

data class DataFileDetailResponse(

val id: Long?,

val title: String,

val type: String,

val lastModified: LocalDateTime,

val sizeMB: Double,

val tags: List<TagDTO>,

val sha256: String,

) {

// existing class body...

}

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt around lines 21 to 29, the DTO declares `id` as a mutable `var` but it should be immutable; change `var id: Long?` to `val id: Long?` to preserve DTO immutability and update any code that mutates this property (replace mutations with constructor-based creation or copies) and adjust affected call sites/tests to use the immutable property.

coderabbitai · 2025-08-17T14:00:18Z

src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt

+        fun of(dataFile: DataFile, tags: List<TagDTO>): DataFileDetailResponse {
+            return DataFileDetailResponse(
+                dataFile.id,
+                dataFile.title,
+                dataFile.type,
+                dataFile.updatedAt,
+                dataFile.sizeBytes / (1024.0 * 1024.0),
+                tags,
+                dataFile.sha256,
+            )
+        }


🧹 Nitpick (assertive)

매직 넘버 제거: MB 변환 상수로 추출

1024.0 * 1024.0 하드코딩 대신 상수로 추출하여 의미를 드러내고 재사용성을 높이세요.

data class DataFileDetailResponse( @@ ) { companion object { + private const val BYTES_PER_MB = 1_048_576.0 fun of(dataFile: DataFile, tags: List<TagDTO>): DataFileDetailResponse { return DataFileDetailResponse( dataFile.id, dataFile.title, dataFile.type, dataFile.updatedAt, - dataFile.sizeBytes / (1024.0 * 1024.0), + dataFile.sizeBytes / BYTES_PER_MB, tags, dataFile.sha256, ) } } }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

fun of(dataFile: DataFile, tags: List<TagDTO>): DataFileDetailResponse {

return DataFileDetailResponse(

dataFile.id,

dataFile.title,

dataFile.type,

dataFile.updatedAt,

dataFile.sizeBytes / (1024.0 * 1024.0),

tags,

dataFile.sha256,

)

}

data class DataFileDetailResponse(

// existing properties…

) {

companion object {

private const val BYTES_PER_MB = 1_048_576.0

fun of(dataFile: DataFile, tags: List<TagDTO>): DataFileDetailResponse {

return DataFileDetailResponse(

dataFile.id,

dataFile.title,

dataFile.type,

dataFile.updatedAt,

dataFile.sizeBytes / BYTES_PER_MB,

tags,

dataFile.sha256,

)

}

}

}

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/dto/DataFileResponseDTO.kt around lines 31 to 41, the code uses the magic number 1024.0 * 1024.0 to convert bytes to megabytes; extract that expression into a well-named constant (e.g., BYTES_PER_MB or BYTES_IN_MB) declared near the top of the file or inside the companion object as a Double, then replace the inline calculation with division by that constant to clarify intent and enable reuse.

coderabbitai · 2025-08-17T14:00:18Z

src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileTagRepository.kt

+    @Query("""
+        SELECT dft
+        FROM DataFileTag dft
+        JOIN FETCH dft.tag t
+        WHERE dft.dataFile = :dataFile
+    """)
+    fun findTagsByDataFile(@Param("dataFile") dataFile: DataFile): List<DataFileTag>


🛠️ Refactor suggestion

N+1 가능성: 목록 조회 시 파일별 개별 태그 조회는 비효율적 — 배치 페치 메서드 추가 권장

페이지 사이즈만큼 findTagsByDataFile가 반복 호출될 수 있어 쿼리 수가 N으로 증가합니다. 한 번에 여러 파일의 태그를 패치 조인으로 가져오는 배치 메서드를 추가하고 서비스에서 그룹핑하세요.

아래 메서드를 추가(서비스는 dataFile.id 기준 그룹핑)하면 쿼리를 1회로 줄일 수 있습니다:

@Query( """ SELECT dft FROM DataFileTag dft JOIN FETCH dft.tag t WHERE dft.dataFile IN :dataFiles """ ) fun findAllByDataFileInFetchTag(@Param("dataFiles") dataFiles: List<DataFile>): List<DataFileTag>

서비스 측 그룹핑 예시:

val tagsByFileId = dataFileTagRepository .findAllByDataFileInFetchTag(files) .groupBy({ it.dataFile.id!! }, { TagDTO(it.tag.id, it.tag.name) })

참고: default_batch_fetch_size가 도움이 되긴 하지만, 위와 같은 명시적 패치 조인이 가장 확실하게 N+1을 제거합니다.

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileTagRepository.kt around lines 12–18, the current findTagsByDataFile causes N+1 when called per file; add a batch fetch method named findAllByDataFileInFetchTag that accepts a List<DataFile> and returns List<DataFileTag>, implemented with a JPQL query that uses JOIN FETCH on dft.tag and a WHERE dft.dataFile IN :dataFiles to load all tags for multiple files in one query; then update the service to call this new repository method and group the results by dataFile.id to build TagDTO lists per file (i.e., groupBy { it.dataFile.id } mapping to TagDTO).

coderabbitai · 2025-08-17T14:00:18Z

src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt

+        val dataFileList: MutableList<DataFileDetailResponse> = ArrayList()
+        dataSlice.forEach{ dataFile ->
+            val dataFileTags: List<DataFileTag> = dataFileTagRepository.findTagsByDataFile(dataFile)
+
+            val tagDtos: List<TagDTO> = dataFileTags.map{
+                dataFileTag ->
+                val tag = dataFileTag.tag
+                TagDTO(tag.id, tag.name)
+            }
+
+            dataFileList.add(DataFileDetailResponse.of(dataFile, tagDtos))
+        }


🛠️ Refactor suggestion

N+1 쿼리 발생 — 태그를 다건 조회로 한 번에 가져오도록 리팩터링 권장

현재는 Slice의 각 DataFile마다 findTagsByDataFile을 호출합니다. 목록 크기에 비례해 쿼리 수가 증가하며, 배치 페치 설정만으로는 한계가 있습니다. 다건 조회 후 groupBy하여 1+1 패턴으로 줄이세요.

아래와 같이 서비스 코드를 변경하고, 리포지토리에 다건 조회 메서드를 추가해 주세요.

서비스 변경(해당 범위 교체):

- val dataFileList: MutableList<DataFileDetailResponse> = ArrayList() - dataSlice.forEach{ dataFile -> - val dataFileTags: List<DataFileTag> = dataFileTagRepository.findTagsByDataFile(dataFile) - - val tagDtos: List<TagDTO> = dataFileTags.map{ - dataFileTag -> - val tag = dataFileTag.tag - TagDTO(tag.id, tag.name) - } - - dataFileList.add(DataFileDetailResponse.of(dataFile, tagDtos)) - } + val ids = dataSlice.mapNotNull { it.id } + val tagMap: Map<Long, List<TagDTO>> = + dataFileTagRepository.findByDataFileIdIn(ids) + .groupBy({ requireNotNull(it.dataFile.id) }) { TagDTO(it.tag.id, it.tag.name) } + + val dataFileList: List<DataFileDetailResponse> = dataSlice.map { dataFile -> + val tagDtos = tagMap[dataFile.id] ?: emptyList() + DataFileDetailResponse.of(dataFile, tagDtos) + }.toList()

리포지토리 추가(별도 파일 수정 필요):

// src/main/kotlin/simplerag/ragback/domain/document/repository/DataFileTagRepository.kt @Query("select dft from DataFileTag dft join fetch dft.tag where dft.dataFile.id in :ids") fun findByDataFileIdIn(@Param("ids") ids: List<Long>): List<DataFileTag>

이 방식은 (데이터 파일 슬라이스 1회 + 태그 일괄 1회)로 고정되어 규모가 커져도 효율적입니다.

🤖 Prompt for AI Agents

In src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt around lines 85 to 96, the code issues N+1 queries by calling findTagsByDataFile for each DataFile; replace that loop with a batched approach: call a new repository method that returns all DataFileTag for a list of dataFile IDs (e.g., findByDataFileIdIn(ids)), group the returned DataFileTag list by dataFile.id, then map each DataFile to its tag DTOs using the grouped map and construct DataFileDetailResponse.of(dataFile, tagDtos) for each item; also add the repository method signature in DataFileTagRepository as described in the review so tags are fetched in one query.

어짜피 fetch join으로 가져오니까 상관없지 않나요?

@catturtle123 좋은 질문입니다! 하지만 fetch join과 N+1 쿼리 문제는 서로 다른 개념입니다.

현재 상황 분석:

현재 코드에서는 여전히 N+1 쿼리가 발생합니다:

1번 쿼리: dataFileRepository.findByIdGreaterThanOrderById() - DataFile 목록 조회

N번 쿼리: dataFileTagRepository.findTagsByDataFile() - 각 DataFile마다 개별 호출

Fetch Join의 역할:

Fetch join은 단일 쿼리 내에서 연관 엔티티를 즉시 로딩하여 지연 로딩으로 인한 추가 쿼리를 방지합니다

하지만 메서드 자체가 여러 번 호출되는 것은 막지 못합니다

예시:
DataFile이 3개 있다면:

-- 1번: DataFile 목록 조회 SELECT df.* FROM data_file df WHERE df.id > ? ORDER BY df.id LIMIT 3 -- 2번: 첫 번째 DataFile의 태그 조회 (fetch join 적용) SELECT dft.*, t.* FROM data_file_tag dft JOIN tag t ON dft.tag_id = t.id WHERE dft.data_file_id = 1 -- 3번: 두 번째 DataFile의 태그 조회 (fetch join 적용) SELECT dft.*, t.* FROM data_file_tag dft JOIN tag t ON dft.tag_id = t.id WHERE dft.data_file_id = 2 -- 4번: 세 번째 DataFile의 태그 조회 (fetch join 적용) SELECT dft.*, t.* FROM data_file_tag dft JOIN tag t ON dft.tag_id = t.id WHERE dft.data_file_id = 3

총 4번의 데이터베이스 라운드트립이 발생합니다.

권장 방식:

-- 1번: DataFile 목록 조회 SELECT df.* FROM data_file df WHERE df.id > ? ORDER BY df.id LIMIT 3 -- 2번: 모든 DataFile의 태그를 한 번에 조회 SELECT dft.*, t.* FROM data_file_tag dft JOIN tag t ON dft.tag_id = t.id WHERE dft.data_file_id IN (1,2,3)

총 2번의 데이터베이스 라운드트립으로 줄어듭니다.

데이터 양이 많아질수록 성능 차이가 크게 벌어지므로, 일괄 조회 방식으로 리팩터링하는 것을 권장드립니다.

✏️ Learnings added

Learnt from: catturtle123 PR: Simple-RAG/BACK#7 File: src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt:85-96 Timestamp: 2025-08-17T17:04:32.878Z Learning: Fetch join은 단일 쿼리 내에서 연관 엔티티의 지연 로딩을 방지하지만, 메서드가 루프에서 여러 번 호출되는 N+1 쿼리 문제는 해결하지 못함. 일괄 조회 후 groupBy 방식으로 1+1 쿼리 패턴을 만들어야 함.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

좋습니다. 하지만 batch_size를 등록해놨으니 N+1까지는 아니지 않나요?

@catturtle123 좋은 지적입니다! batch_size 설정에 대한 이해가 정확하시네요. 하지만 이 경우에는 완전히 해결되지 않습니다.

Batch Size의 동작 방식:

default_batch_fetch_size: 500은 주로 지연 로딩(lazy loading) 상황에서 동일한 연관관계를 여러 번 로드할 때 배치로 묶어서 처리합니다

예: 엔티티의 연관관계를 .tags 프로퍼티로 접근할 때 Hibernate가 자동으로 배치 처리

현재 코드의 한계:

dataSlice.forEach{ dataFile -> val dataFileTags: List<DataFileTag> = dataFileTagRepository.findTagsByDataFile(dataFile) // 명시적 메서드 호출 }

각 findTagsByDataFile() 호출은 별개의 쿼리 실행입니다

Repository 메서드를 명시적으로 호출하므로 Hibernate가 자동으로 배치할 수 없습니다

결과적으로 여전히 DataFile 개수만큼 쿼리가 실행됩니다

실제 쿼리 수:

batch_size 없이: 1 + N번 (완전한 N+1)

batch_size 있어도: 1 + N번 (동일함)

진정한 해결책:

val ids = dataSlice.mapNotNull { it.id } val tagMap = dataFileTagRepository.findByDataFileIdIn(ids) // 1번의 IN 쿼리

이 방식은 1 + 1 = 총 2번의 쿼리로 고정됩니다

batch_size는 좋은 최적화이지만, 명시적인 일괄 조회 패턴이 더 효과적입니다!

🧠 Learnings used

Learnt from: catturtle123 PR: Simple-RAG/BACK#7 File: src/main/kotlin/simplerag/ragback/domain/document/service/DataFileService.kt:85-96 Timestamp: 2025-08-17T17:04:32.878Z Learning: Fetch join은 단일 쿼리 내에서 연관 엔티티의 지연 로딩을 방지하지만, 메서드가 루프에서 여러 번 호출되는 N+1 쿼리 문제는 해결하지 못함. 일괄 조회 후 groupBy 방식으로 1+1 쿼리 패턴을 만들어야 함.

coderabbitai · 2025-08-17T14:00:18Z

src/test/kotlin/simplerag/ragback/domain/document/service/DataFileServiceTest.kt

+    @Test
+    @DisplayName("데이터 조회가 잘 된다")
+    @Transactional
+    fun getDataFilesOK() {
+        // given
+        val bytes1 = "test1".toByteArray()
+        val sha1 = sha256Hex(bytes1)
+        val bytes2 = "test2".toByteArray()
+        val sha2 = sha256Hex(bytes2)
+        dataFileRepository.saveAll(
+            listOf(
+                DataFile(
+                title = "exists",
+                type = "text/plain",
+                sizeBytes = 0,
+                sha256 = sha1,
+                fileUrl = "fake://original/exists.txt",
+            ),
+                DataFile(
+                    title = "exists2",
+                    type = "text/pdf",
+                    sizeBytes = 0,
+                    sha256 = sha2,
+                    fileUrl = "fake://original/exists.txt",
+                )
+            )
+        )
+
+        val cursor = 0L
+        val take = 2
+
+        // when
+        val dataFiles = dataFileService.getDataFiles(cursor, take)
+
+        // then
+        val dataFileDetailResponse = dataFiles.dataFileDetailResponseList[0]
+        assertEquals(dataFileDetailResponse.title, "exists")
+        assertEquals(dataFileDetailResponse.type, "text/plain")
+        assertEquals(dataFileDetailResponse.sizeMB, 0.0)
+        assertEquals(dataFileDetailResponse.sha256, sha1)
+
+        val dataFileDetailResponse2 = dataFiles.dataFileDetailResponseList[1]
+        assertEquals(dataFileDetailResponse2.title, "exists2")
+        assertEquals(dataFileDetailResponse2.type, "text/pdf")
+        assertEquals(dataFileDetailResponse2.sizeMB, 0.0)
+        assertEquals(dataFileDetailResponse2.sha256, sha2)
+
+        assertEquals(dataFiles.cursor, dataFileDetailResponse2.id)
+        assertEquals(dataFiles.hasNext, false)
+    }


🧹 Nitpick (assertive)

조회 해피패스만 검증됨 — hasNext=true 및 커서 후속 조회 케이스 테스트 추가 제안

현재 테스트는 take=2, 레코드=2로 hasNext=false와 nextCursor=마지막 id만 검증합니다. 다음도 함께 커버해 주세요:

hasNext=true인 경우(take < 총 레코드 수)와 nextCursor가 적절히 설정되는지

반환된 cursor를 이용해 후속 페이지를 조회했을 때 기대 데이터가 나오는지

sizeMB 변환/반올림(버그 수정 이력)에 대한 비영(0이 아닌) 값 검증

원하시면 아래와 같이 테스트를 추가할 수 있습니다:

@Test @DisplayName("커서 기반 조회 - hasNext=true, cursor로 후속 페이지 조회") @Transactional fun getDataFiles_hasNext_and_followUpCursor() { // given val f1 = dataFileRepository.save(DataFile("t1", "text/plain", 0, "sha1", "u1")) val f2 = dataFileRepository.save(DataFile("t2", "text/plain", 0, "sha2", "u2")) val f3 = dataFileRepository.save(DataFile("t3", "text/plain", 0, "sha3", "u3")) // when: 첫 페이지 val first = dataFileService.getDataFiles(cursor = 0L, take = 2) // then assertEquals(2, first.dataFileDetailResponseList.size) assertTrue(first.hasNext) val next = requireNotNull(first.cursor) // when: 후속 페이지 val second = dataFileService.getDataFiles(cursor = next, take = 2) // then: 남은 1건 assertEquals(1, second.dataFileDetailResponseList.size) assertFalse(second.hasNext) assertEquals("t3", second.dataFileDetailResponseList.first().title) }

또한 sizeMB 변환 검증 예시(반올림 정책에 맞게 기대값 조정 필요):

@Test @DisplayName("sizeBytes -> sizeMB 변환 검증") @Transactional fun getDataFiles_sizeMB_conversion() { val oneAndHalfMB = 1_572_864L // 1.5 MiB val sha = "shaX" dataFileRepository.save(DataFile("mb", "application/octet-stream", oneAndHalfMB, sha, "u")) val res = dataFileService.getDataFiles(cursor = 0L, take = 1) val item = res.dataFileDetailResponseList.first() // 예: 1.5 MB로 표현된다면 아래 기대값 조정 assertTrue(item.sizeMB in 1.49..1.51, "sizeMB 변환/반올림을 확인해 주세요") }

추가 테스트 작성이 필요하시면 전체 코드 패치로 도와드리겠습니다.

catturtle123 added 4 commits August 17, 2025 17:12

✨ feature: 데이터 조회 로직

c3f0362

🐛 fix: MB 변환 수정

d882a21

🧪 test: 조회 테스트 추가

81a07b8

Merge branch 'develop' into feature/#3-document-read

5b8c2bf

catturtle123 self-assigned this Aug 17, 2025

catturtle123 added the feature New feature or request label Aug 17, 2025

catturtle123 linked an issue Aug 17, 2025 that may be closed by this pull request

✨ [Feature] 문서 조회 추가 #3

Closed

2 tasks

coderabbitai bot reviewed Aug 17, 2025

View reviewed changes

catturtle123 added 6 commits August 17, 2025 22:43

🐛 fix: 커서기반 페이지 네이션 수정

9455d92

♻️ refactor: take 검증 추가

d641455

♻️ refactor: response dto 이름 수정

4376657

⚡ perf: Tag N+1 문제 해결

179442b

♻️ refactor: cursor 구하는 로직 수정

3c46afa

🐛 fix: 테스트 환경 버그 해결

c663f22

coderabbitai bot reviewed Aug 17, 2025

View reviewed changes

catturtle123 merged commit 56aa92c into develop Aug 17, 2025
1 check passed

catturtle123 deleted the feature/#3-document-read branch August 17, 2025 14:01

This was referenced Aug 17, 2025

✨ Feature: dataFile 삭제 기능 추가 #8

Merged

♻️ Refactor: index, dataFile converting refactoring #15

Merged

-    fun findByOrderById(pageable: Pageable): Slice<DataFile>
+class DataFileService(
+    /*…*/
+) {
+    @Transactional(readOnly = true)
+    fun getDataFiles(cursor: Long, take: Int): DataFileDetailResponseList {
+        // 첫 페이지는 cursor=0L로 간주하고, 이후에는 cursorId를 기준으로 조회
+        val pageable = PageRequest.of(0, take)
+        val dataSlice = dataFileRepository.findByIdGreaterThanOrderById(cursor, pageable)
+        val dataFileList: MutableList<DataFileDetailResponse> = ArrayList()
+        dataSlice.forEach { dataFile -> /*…*/ }
+        // …
+    }
+}

✨ Feature: dataFile 읽기 추가 #7

✨ Feature: dataFile 읽기 추가 #7

Uh oh!

Conversation

catturtle123 commented Aug 17, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📌 PR 개요

🔍 관련 이슈

✨ 변경 사항

📸 스크린샷 / 동작 확인 결과 (선택)

✅ 체크리스트

🗒️ 추가 메모

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Assessment against linked issues

Assessment against linked issues: Out-of-scope changes

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

catturtle123 Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

catturtle123 commented Aug 17, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 17, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

coderabbitai bot Aug 17, 2025 •

edited

Loading