diff --git a/.kiro/specs/import-history-sidebar/design.md b/.kiro/specs/import-history-sidebar/design.md index 09fde6d7a..26437bd30 100644 --- a/.kiro/specs/import-history-sidebar/design.md +++ b/.kiro/specs/import-history-sidebar/design.md @@ -72,13 +72,13 @@ graph TD - Show loading states and empty states appropriately - Integrate with existing workspace selection functionality - Add "View All Imports" button below recent imports list -- Add "Import Documents" button to open ImportModal +- Add "Import Documents" button to open ImportFlow **Modified Structure:** ```vue diff --git a/.kiro/specs/import-history-sidebar/requirements.md b/.kiro/specs/import-history-sidebar/requirements.md index c0e749b0f..bb0f1b8db 100644 --- a/.kiro/specs/import-history-sidebar/requirements.md +++ b/.kiro/specs/import-history-sidebar/requirements.md @@ -85,7 +85,7 @@ The feature replaces the current example datasets section in the home page sideb 1. WHEN I view the Recent Imports sidebar THEN the system SHALL display a "View All Imports" button below the recent imports list 2. WHEN I click "View All Imports" THEN the system SHALL open the ImportHistoryList modal showing the complete import history for the workspace 3. WHEN I view the Recent Imports sidebar THEN the system SHALL display an "Import Documents" button -4. WHEN I click "Import Documents" THEN the system SHALL open the ImportModal for uploading new documents +4. WHEN I click "Import Documents" THEN the system SHALL open the ImportFlow for uploading new documents 5. WHEN the ImportHistoryList modal is open THEN the system SHALL support all existing functionality (filtering, pagination, viewing details) 6. 
WHEN I close the ImportHistoryList modal THEN the system SHALL return to the home page with the Recent Imports sidebar still visible diff --git a/.kiro/specs/import-history-sidebar/tasks.md b/.kiro/specs/import-history-sidebar/tasks.md index 3b15e85ec..959b629d7 100644 --- a/.kiro/specs/import-history-sidebar/tasks.md +++ b/.kiro/specs/import-history-sidebar/tasks.md @@ -103,13 +103,13 @@ - Update `extralit-frontend/pages/index.vue` to replace example datasets with RecentImports component - Add event handlers for import selection and modal opening - Integrate with existing workspace selection functionality - - Maintain existing ImportModal and ImportHistoryList modal functionality + - Maintain existing ImportFlow and ImportHistoryList modal functionality - _Requirements: 1.1, 1.5, 2.1, 6.2, 6.4_ - [x] 8.2 Update home page view model - Modify useHomeViewModel to handle Recent Imports integration - Add navigation methods for import configuration routing - - Integrate modal opening logic for ImportHistoryList and ImportModal + - Integrate modal opening logic for ImportHistoryList and ImportFlow - _Requirements: 2.1, 6.2, 6.4, 6.5_ - [x] 9. Add Import Configuration Route diff --git a/.kiro/specs/papers-library-importer/design.md b/.kiro/specs/papers-library-importer/design.md index f2b60a543..7e0f8a045 100644 --- a/.kiro/specs/papers-library-importer/design.md +++ b/.kiro/specs/papers-library-importer/design.md @@ -2,9 +2,11 @@ ## Overview -The Papers Library Importer feature enables researchers to import their existing reference libraries from .bib files and PDF folders into Extralit workspaces. The system leverages the existing document upload endpoint (`POST /documents`) and job queue system to process bibliographic metadata from .bib files, match PDF files to references, and provide a user-friendly interface for reviewing and confirming imports before executing bulk operations. 
+The Papers Library Importer feature enables researchers to import their existing reference libraries from bibliography files (.bib or .csv) and PDF folders into Extralit workspaces. The system leverages the existing document upload endpoint (`POST /documents`) and job queue system to process bibliographic metadata from various formats, match PDF files to references using advanced path matching algorithms, and provide a user-friendly interface for reviewing and confirming imports before executing bulk operations. -**Generalized Tabular Import Support**: The import system is designed to handle tabular data beyond just BibTeX files. The core functionality supports CSV imports and other structured data formats by storing imported data as dataframes with schema information. This enables future expansion to support various research data import formats while maintaining consistent processing workflows. +**Generalized Tabular Import Support**: The import system is designed to handle multiple tabular data formats including BibTeX (.bib) and CSV files. The core functionality supports flexible column mapping for CSV imports and stores imported data as dataframes with schema information. This enables consistent processing workflows across different research data import formats. + +**Enhanced PDF Matching**: The system uses sophisticated file matching algorithms including maximum prefix path matching, exact filename matching, and fuzzy string matching to associate PDF files with bibliography entries. Users can import references with or without associated PDF files. The design follows Extralit's existing patterns: context-based backend architecture, FastAPI endpoints with proper authorization, Vue.js frontend components, and the existing RQ-based asynchronous job processing system for bulk operations. @@ -12,7 +14,7 @@ The design follows Extralit's existing patterns: context-based backend architect ### High-Level Flow -1. 
**Frontend Processing Phase**: User uploads .bib file and PDFs to frontend, which parses BibTeX entries into generic dataframe format and matches files to references +1. **Frontend Processing Phase**: User uploads bibliography file (.bib or .csv) and PDFs to frontend, which parses entries into generic dataframe format and matches files to references using advanced path matching 2. **Analysis Phase**: Frontend sends file metadata (not file contents) to backend for add/update/skip status analysis 3. **Preview Phase**: Frontend displays import preview with status for each document based on server analysis 4. **Bulk Upload Phase**: User confirms import, frontend sends paginated requests to bulk upload endpoint with actual file contents @@ -23,8 +25,8 @@ The design follows Extralit's existing patterns: context-based backend architect ```mermaid graph TD - A[Frontend Upload Component] --> B[Frontend BibTeX Parser] - A --> C[Frontend File Matcher] + A[Frontend Upload Component] --> B[Frontend Bibliography Parser (.bib/.csv)] + A --> C[Frontend Advanced File Matcher] B --> D[File Metadata Analysis Request] B --> E[Generic Dataframe Conversion] C --> D @@ -144,7 +146,7 @@ Note to reuse existing styles in extralit-frontend/assets/scss/base/base.scss, e **Workspace Selection Integration:** - Modify WorkspacesFilter component to support single workspace selection instead of multi-select -- Pass selected workspace ID to ImportModal component for import analysis +- Pass selected workspace ID to ImportFlow component for import analysis - Ensure workspace context is maintained throughout the import workflow #### 2. FlowModal Base Component (`extralit-frontend/components/base/base-flow-modal/BaseFlowModal.vue`) @@ -202,10 +204,10 @@ interface FlowModalProps { - Smooth transitions between steps - Loading states and disabled button styling -#### 3. Import Modal Workflow (`extralit-frontend/components/features/import/ImportModal.vue`) +#### 3. 
Import Modal Workflow (`extralit-frontend/components/features/import/ImportFlow.vue`) **Full-page modal using new BaseFlowModal component with multi-step workflow:** -- Step 1: Upload Bibliography File (.bib file upload) +- Step 1: Upload Bibliography File (.bib or .csv file upload) - Step 2: Upload Full-Text PDFs (multiple PDF file upload) - Step 3: Import Analysis & Selection (table with toggle functionality) - Step 4: Batch Upload Progress (live progress tracking) @@ -216,23 +218,37 @@ interface FlowModalProps { - Passes workspace ID to ImportAnalysisTable for backend analysis requests - Maintains workspace context throughout the import workflow +**Flow Control Improvements:** +- Requires confirmation to close modal during import process +- No confirmation required after successful completion +- Preserves uploaded data when navigating between steps +- Refreshes recent import list on home screen when modal closes after completion +- Supports flexible upload order (bibliography or PDFs can be uploaded first) + #### 3. 
Upload Steps Components -**Step 1: Bibliography Upload (`extralit-frontend/components/features/import/ImportBibUpload.vue`)** -- Single .bib file upload with drag-and-drop or file picker +**Step 1: Bibliography Upload (`extralit-frontend/components/features/import/ImportFileUpload.vue`)** +- Combined .bib and .csv file upload with drag-and-drop or file picker - Support for ";"-separated values (especially the `file` attribute in zotero_export.bib) -- Parsing preview of dataframe columns parsed from the .bib file, +- **CSV Column Selection**: When CSV is uploaded, display column selection interface allowing user to: + - Select reference/ID column (primary key) + - Select files column for PDF matching + - Preview first few rows of data +- Parsing preview of dataframe columns parsed from the bibliography file - Display upload status and reference count +- Allow flexible upload order (bibliography first or PDFs first) -**Step 2: PDF Upload (`extralit-frontend/components/features/import/ImportPdfUpload.vue`)** +**Step 2: PDF Upload (integrated into ImportFileUpload.vue)** - Multiple PDF file upload with drag-and-drop or folder selection -- File path matching preview with bibliography entries +- Advanced file path matching preview with bibliography entries using maximum prefix matching - Upload progress and file validation - Summary status showing matched/unmatched files +- Progressive file addition with deduplication **Dependencies:** - `vue-dropzone` or similar for file uploads - JavaScript BibTeX parser library (e.g., `bibtex-parse-js` or `@retorquere/bibtex-parser`) +- Performant CSV parser library (e.g., `papaparse`) Example BibTeX files: @@ -301,14 +317,17 @@ Example BibTeX files: **Features using new simple table component:** - Uses `GetImportAnalysisUseCase` from `~/v1/domain/usecases/get-import-analysis-use-case.ts` for backend communication -- Uses `useImportAnalysisViewModel` for reactive state management and API integration +- Uses 
`useImportAnalysisTableViewModel` for reactive state management and API integration - Imports backend API types from `~/v1/domain/entities/import/ImportAnalysis.ts` - Imports UI component types from `./types.ts` for table configuration and component state - Tabular display with columns: Reference (first column freeze), and Files, Import Status (last column freeze), while the rest of the columns imported from are sorted Title, Authors, Year, to the rest of the table -- Toggle functionality for each reference to select Add/Update/Skip +- Toggle functionality for each reference to select Add/Update/Skip/Ignore - User can toggle from Add or Update to Ignore, or back - Status indicators with color coding (Add: green, Update: blue, Skip: gray, Ignore: gray, Failed: red) - Filterable columns on the status indicator +- **Import Filter Options**: Toggle between "Import All References" and "Import Only References with PDFs" +- When "Import Only References with PDFs" is selected, references without matched files are automatically set to "Ignore" status +- When "Import All References" is selected, references without matched files can be imported as metadata-only entries - Sends POST requests to `/api/v1/imports/analyze` with `ImportAnalysisRequest` to prepopulate Import Status column - Receives workspace ID as prop and passes it to the analysis use case - Automatically triggers analysis when dataframe data is available and workspace ID is provided @@ -434,7 +453,7 @@ class ImportHistoryResponse(BaseModel): """Response schema for import history creation and retrieval.""" id: UUID = Field(..., description="Import history record ID") workspace_id: UUID = Field(..., description="Workspace ID") - user_id: UUID = Field(..., description="User ID who created the import") + username: str = Field(..., description="User who created the import") filename: str = Field(..., description="Import filename") created_at: datetime = Field(..., description="Creation timestamp") data: 
Optional[Dict] = Field(None, description="Tabular dataframe data (only in detailed view)") @@ -543,8 +562,9 @@ The import system processes tabular data (BibTeX, CSV, etc.) into a standardized - Type inference applied automatically (string, integer, float) - Schema generated dynamically based on available fields -**Future CSV Support:** -- First column as primary key (configurable) +**CSV Support:** +- User-selectable reference column as primary key +- User-selectable files column for PDF matching - Column headers map to dataframe field names - Type inference for string, integer, float fields - Flexible schema definition for different data sources @@ -555,10 +575,12 @@ The import system processes tabular data (BibTeX, CSV, etc.) into a standardized - Preserves all original metadata without field-specific mapping requirements ### PDF-to-Reference Matching Logic -1. **Exact Match**: PDF filename matches Reference exactly -2. **Partial Match**: PDF filename contains Reference -3. **Fuzzy Match**: Use string similarity for close matches -4. **Manual Association**: Allow user to manually associate files +1. **Maximum Prefix Path Match**: PDF file path has maximum prefix match with bibliography entry file path (highest priority) +2. **Exact Match**: PDF filename matches Reference exactly +3. **File Field Match**: PDF filename matches parsed file paths from bibliography entry +4. **Fuzzy Title Match**: PDF filename contains significant words from reference title (lowest priority) +5. **Progressive File Addition**: Support for adding multiple PDF files progressively with proper deduplication +6. 
**Multiple Files per Reference**: Handle cases where one reference matches multiple PDF files correctly ## Error Handling @@ -620,10 +642,10 @@ extralit-frontend/ │ └── WorkspaceSelector.vue # Modified for single workspace selection └── components/features/import/ ├── types.ts # UI component types + re-exports - ├── ImportModal.vue # Main workflow modal (receives workspace ID) + ├── ImportFlow.vue # Main workflow modal (receives workspace ID) ├── ImportFileUpload.vue # Step 1 & 2: File uploads ├── ImportAnalysisTable.vue # Step 3: Analysis & selection (uses workspace ID) - ├── useImportAnalysisViewModel.ts # View model that calls get-import-analysis-use-case.ts + ├── useImportAnalysisTableViewModel.ts # View model that calls get-import-analysis-use-case.ts └── ImportBatchProgress.vue # Step 4: Upload progress ``` diff --git a/.kiro/specs/papers-library-importer/requirements.md b/.kiro/specs/papers-library-importer/requirements.md index 88c921ed3..5fccb89c2 100644 --- a/.kiro/specs/papers-library-importer/requirements.md +++ b/.kiro/specs/papers-library-importer/requirements.md @@ -10,20 +10,22 @@ The feature consists of two main components: a backend import service that proce ### Requirement 1 -**User Story:** As a researcher, I want to upload a .bib file and folder of PDFs to import my reference library into an Extralit workspace, so that I can use my existing document collection for extraction workflows and reference the documents in during the annotation process. +**User Story:** As a researcher, I want to upload a bibliography file (.bib or .csv) and folder of PDFs to import my reference library into an Extralit workspace, so that I can use my existing document collection for extraction workflows and reference the documents during the annotation process. #### Acceptance Criteria 1. WHEN I upload a .bib file THEN the system SHALL parse the bibliographic entries and extract metadata (title, authors, venue, year, DOI, PMID, reference) -2. 
WHEN I upload a folder of PDF files THEN the system SHALL process each PDF and attempt to match it with bibliographic entries -3. WHEN a PDF filename matches a .bib entry reference THEN the system SHALL associate the PDF with that bibliographic entry -4. WHEN I provide a collection tag THEN the system SHALL add this tag to all imported documents' metadata -5. WHEN documents are processed THEN the system SHALL store the reference as the unique identifier for deduplication -6. IF a PDF cannot be matched to a .bib entry THEN the system SHALL mark it as "failed" and provide error details +2. WHEN I upload a .csv file THEN the system SHALL parse the tabular data and allow me to select the reference column and files column for PDF matching +3. WHEN I upload a folder of PDF files THEN the system SHALL process each PDF and attempt to match it with bibliographic entries using maximum prefix path matching +4. WHEN a PDF file path has a maximum prefix match with a bibliography entry file path THEN the system SHALL associate the PDF with that bibliographic entry +5. WHEN I provide a collection tag THEN the system SHALL add this tag to all imported documents' metadata +6. WHEN documents are processed THEN the system SHALL store the reference as the unique identifier for deduplication +7. WHEN I upload files in any order (bibliography first or PDFs first) THEN the system SHALL allow me to proceed to the next step +8. IF a PDF cannot be matched to a bibliography entry THEN the system SHALL mark it as unmatched but still allow import ### Requirement 2 -**User Story:** As a researcher, I want to see a preview of all documents to be imported with their import status, so that I can review and confirm the import before committing changes. +**User Story:** As a researcher, I want to see a preview of all documents to be imported with their import status and choose whether to import references without PDFs, so that I can review and confirm the import before committing changes. 
#### Acceptance Criteria @@ -32,8 +34,10 @@ The feature consists of two main components: a backend import service that proce 3. WHEN a document has a new reference THEN the system SHALL mark it as "add" 4. WHEN a document has an existing reference but new/updated files THEN the system SHALL mark it as "update" 5. WHEN a document already exists with no changes THEN the system SHALL mark it as "skip" -6. WHEN a .bib entry has no matching PDF files THEN the system SHALL mark it as "failed" -7. WHEN I review the preview THEN the system SHALL allow me to change the action for individual documents (add/update/skip) +6. WHEN a bibliography entry has no matching PDF files THEN the system SHALL mark it as "no files" but still allow import +7. WHEN I review the preview THEN the system SHALL allow me to change the action for individual documents (add/update/skip/ignore) +8. WHEN I review the preview THEN the system SHALL provide an option to import only references with matched PDFs or import all references including those without PDFs +9. WHEN I select "only with PDFs" THEN the system SHALL automatically set references without PDFs to "ignore" status ### Requirement 3 @@ -70,11 +74,12 @@ The feature consists of two main components: a backend import service that proce #### Acceptance Criteria 1. WHEN .bib file parsing fails THEN the system SHALL provide specific error messages about malformed entries -2. WHEN PDF files are corrupted or unreadable THEN the system SHALL mark them as failed with detailed error information -3. WHEN file uploads fail due to size or network issues THEN the system SHALL provide retry mechanisms -4. WHEN duplicate references exist in the .bib file THEN the system SHALL handle them appropriately and warn the user -5. WHEN the workspace storage quota is exceeded THEN the system SHALL provide clear error messages and stop the import -6. IF the import process is interrupted THEN the system SHALL allow users to resume or restart the import +2. 
WHEN .csv file parsing fails THEN the system SHALL provide specific error messages about malformed data and allow column selection +3. WHEN PDF files are corrupted or unreadable THEN the system SHALL mark them as failed with detailed error information +4. WHEN file upload jobs fail due to size or network issues THEN the system SHALL provide retry mechanisms +5. WHEN duplicate references exist in the bibliography file THEN the system SHALL handle them appropriately and warn the user +6. WHEN the workspace storage quota is exceeded THEN the system SHALL provide clear error messages and stop the import +7. IF the import process is interrupted THEN the system SHALL allow users to resume or restart the import ### Requirement 6 @@ -87,4 +92,16 @@ The feature consists of two main components: a backend import service that proce 3. WHEN storing files THEN the system SHALL use the existing secure S3 storage infrastructure 4. WHEN parsing .bib files THEN the system SHALL sanitize input to prevent injection attacks 5. WHEN handling file uploads THEN the system SHALL implement proper virus scanning and validation -6. WHEN processing fails THEN the system SHALL clean up temporary files and partial uploads$$ \ No newline at end of file +6. WHEN processing fails THEN the system SHALL clean up temporary files and partial uploads$$ + +### Requirement 7 + +**User Story:** As a researcher, I want the import modal to have proper flow control and not require confirmation to close after successful completion, so that I have a smooth user experience. + +#### Acceptance Criteria + +1. WHEN the import process is in progress THEN the system SHALL require confirmation before allowing me to close the modal +2. WHEN the import process has completed successfully THEN the system SHALL not require confirmation to close the modal +3. WHEN I navigate between steps during the import process THEN the system SHALL preserve my uploaded data +4. 
WHEN I return to a previous step THEN the system SHALL show my previously uploaded files and selections +5. WHEN I close the import modal after successful completion THEN the system SHALL refresh the recent import list on the home screen \ No newline at end of file diff --git a/.kiro/specs/papers-library-importer/tasks.md b/.kiro/specs/papers-library-importer/tasks.md index 1c6f5690d..da8b121d1 100644 --- a/.kiro/specs/papers-library-importer/tasks.md +++ b/.kiro/specs/papers-library-importer/tasks.md @@ -33,7 +33,7 @@ - Enable easy testing of backend import analysis before building frontend - _Requirements: 1.1, 2.1, 2.2_ -- [ ] 3. Create bulk document upload endpoint +- [x] 3. Create bulk document upload endpoint - [x] 3.1 Implement bulk upload API handler - Create POST /documents/bulk endpoint in documents.py handler - Handle multipart form data with documents_metadata and files @@ -63,7 +63,7 @@ - Remove import history creation from bulk upload (moved to separate endpoint) - _Requirements: 3.2, 3.5, 4.1, 4.6_ -- [ ] 4. Create frontend domain architecture and implement BibTeX parsing +- [x] 4. 
Create frontend domain architecture and implement BibTeX parsing - [x] 4.0 Create frontend domain entities and use cases - Create ImportAnalysis.ts in ~/v1/domain/entities/import/ with backend API data structures - Create get-import-analysis-use-case.ts in ~/v1/domain/usecases/ for API communication @@ -80,6 +80,15 @@ - Use DataframeData type from ~/v1/domain/entities/import/ImportAnalysis.ts - _Requirements: 1.1, 5.1_ +- [x] 4.1.1 Add CSV parser component with column selection + - Add performant CSV parser library dependency (papaparse or similar) + - Implement CSV file parsing in ImportFileUpload.vue component + - Allow user to select reference column and files column for PDF matching + - Convert CSV entries to generic dataframe format (preserve all columns) + - Add validation for required columns and data types + - Handle CSV parsing errors gracefully with user feedback + - _Requirements: 1.2, 5.2_ + - [x] 4.2 Implement file-to-reference matching logic - Create file matching algorithm based on filepath or filename patterns - Implement exact match, partial match, and fuzzy matching strategies @@ -87,17 +96,25 @@ - Add validation for PDF file types and sizes - _Requirements: 1.3, 1.6_ +- [x] 4.2.1 Enhance PDF matching with maximum prefix path matching + - Implement maximum prefix path matching algorithm for better file association + - Improve matching to handle multiple PDFs per reference correctly + - Add progressive file addition with proper deduplication + - Clean up redundant matching information and improve matching accuracy + - Ensure correct handling when reference matches multiple PDF files + - _Requirements: 1.4, 1.8_ + - [x] 5. 
Create home page integration and modal workflow - [x] 5.1 Add Import Documents button to home page and modify workspace selection - Add "Import Documents" button above ImportFromHub and ImportFromPython components in pages/index.vue - Style button to match existing import section design - Connect button to open full-page import modal - Modify WorkspacesFilter and WorkspaceSelector components to support single workspace selection instead of multi-select - - Pass selected workspace ID to ImportModal component + - Pass selected workspace ID to ImportFlow component - Update DatasetList.vue to handle single workspace selection and pass workspace ID to import modal - _Requirements: 1.1, 4.3_ -- [x] 5.2 Create ImportModal.vue full-page modal component with workspace context +- [x] 5.2 Create ImportFlow.vue full-page modal component with workspace context - Implement full-page modal using existing base-modal component - Create multi-step workflow with navigation between steps - Add step indicators and progress tracking @@ -106,6 +123,14 @@ - Pass workspace ID to ImportAnalysisTable component - _Requirements: 2.1, 4.3_ +- [x] 5.2.1 Improve modal flow control and closing behavior + - Update ImportFlow.vue to disable confirm-close after successful completion + - Ensure confirm-close is only active during import process, not after completion + - Allow flexible upload order (bibliography first or PDFs first) + - Improve step navigation to preserve data when moving between steps + - Add event emission to refresh recent import list on home screen after modal closes + - _Requirements: 7.1, 7.2, 7.3, 7.4, 7.5_ + - [ ] 6. 
Implement upload step components - [x] 6.1 Create ImportBibUpload.vue component (Step 1) - Implement .bib file upload with drag-and-drop interface @@ -125,7 +150,7 @@ - Modify WorkspacesFilter.vue to support single workspace selection instead of multi-select - Update WorkspaceSelector.vue to use radio buttons instead of checkboxes for single selection - Update DatasetList.vue to handle single workspace selection and emit workspace ID - - Update useHomeViewModel.ts to track selected workspace ID and pass it to ImportModal + - Update useHomeViewModel.ts to track selected workspace ID and pass it to ImportFlow - Ensure workspace context is maintained and passed to import components - _Requirements: 1.1, 4.3_ @@ -141,15 +166,22 @@ - Create toggle functionality for Add/Update/Skip selection - Send ImportAnalysisRequest to backend and display results using GetImportAnalysisUseCase - Import backend API types from ~/v1/domain/entities/import/ImportAnalysis.ts - - Use useImportAnalysisViewModel for reactive state management and API integration + - Use useImportAnalysisTableViewModel for reactive state management and API integration - Accept workspace ID as prop and pass it to the analysis use case - - Fix workspaceId reference in useImportAnalysisViewModel.ts to properly access workspace from parent component + - Fix workspaceId reference in useImportAnalysisTableViewModel.ts to properly access workspace from parent component - _Requirements: 2.1, 2.2, 2.7_ -- [x] 7.3 Fix ImportModal step navigation and data persistence +- [x] 7.2.1 Add option to import references without PDFs at the ImportAnalysis step + - Add toggle option to import entire table including references without matched PDFs + - Add toggle option to import only references with at least one matched PDF file + - Update table filtering to show/hide references without PDFs based on user selection + - Modify import confirmation logic to respect user's choice about references without PDFs + - _Requirements: 2.6, 
2.8, 2.9_ + +- [x] 7.3 Fix ImportFlow step navigation and data persistence - Update ImportFileUpload.vue to accept initialBibData and initialPdfData props - Add initializeWithExistingData() method to restore component state when navigating back - - Update ImportModal.vue to pass existing data to ImportFileUpload component + - Update ImportFlow.vue to pass existing data to ImportFileUpload component - Ensure proper data persistence across step navigation without losing uploaded files - Fix component lifecycle management to show uploaded files when returning to step 0 - _Requirements: 2.1, 2.2, 4.3_ diff --git a/.kiro/steering/structure.md b/.kiro/steering/structure.md index 724a8d1b8..c214b75df 100644 --- a/.kiro/steering/structure.md +++ b/.kiro/steering/structure.md @@ -62,6 +62,11 @@ extralit-frontend/ └── package.json # npm configuration ``` +### Existing Auto-Imported Components + + + + ### Key Frontend Patterns - **Components**: Base components in `components/base/`, feature components in `components/features/` - **Pages**: Nuxt.js file-based routing in `pages/` diff --git a/.kiro/steering/tech.md b/.kiro/steering/tech.md index 7b625a169..1de5b217e 100644 --- a/.kiro/steering/tech.md +++ b/.kiro/steering/tech.md @@ -1,99 +1,130 @@ -# Technology Stack +--- +inclusion: always +--- + +# Technology Stack & Development Guidelines ## Architecture Overview -Extralit is a multi-component system with 5 core components: -- **Python SDK**: Client library (`pip install extralit`) -- **FastAPI Server**: Backend API handling users, storage, and data management -- **Web UI**: Vue.js/Nuxt.js frontend for data visualization and annotation -- **Vector Database**: ElasticSearch or AWS OpenSearch for scalable search -- **Database**: PostgreSQL for application data storage - -## Backend (extralit-server/) -- **Framework**: FastAPI 
~0.115.0 -- **Database**: SQLAlchemy 2.0 with PostgreSQL (asyncpg) or SQLite (aiosqlite) -- **Search**: ElasticSearch 8.x or OpenSearch 2.x -- **Background Jobs**: Redis Queue (RQ) with Redis -- **Authentication**: python-jose with JWT tokens, OAuth2 support -- **Build System**: PDM (Python Dependency Management) -- **Python**: >=3.9 - -### Key Dependencies -- Pydantic 2.9 for data validation -- Alembic for database migrations -- Uvicorn for ASGI server -- Typer for CLI interface - -## Frontend (extralit-frontend/) -- **Framework**: Nuxt.js 2.17 (Vue.js 2.7) -- **Component Import**: Nuxt automatically scans the ~/components directory and makes all .vue files -- **Build System**: npm/yarn -- **UI Components**: Custom component library with SCSS -- **State Management**: Pinia + Vuex -- **Testing**: Jest (unit), Playwright (e2e) -- **Node**: >=18.16.1 -- **TypeScript**: Use ` - diff --git a/extralit-frontend/components/features/import/ImportFlow.spec.js b/extralit-frontend/components/features/import/ImportFlow.spec.js new file mode 100644 index 000000000..5945d797c --- /dev/null +++ b/extralit-frontend/components/features/import/ImportFlow.spec.js @@ -0,0 +1,157 @@ +import { mount } from "@vue/test-utils"; +import ImportFlow from "./ImportFlow.vue"; + +// Mock dependencies +jest.mock("@nuxtjs/composition-api", () => ({ + ref: jest.fn(), + watch: jest.fn(), +})); + +describe("ImportFlow", () => { + let wrapper; + + const mockWorkspace = { + id: "workspace-1", + name: "Test Workspace", + }; + + const mockDataframeData = { + schema: { + fields: [ + { name: "reference", type: "string" }, + { name: "title", type: "string" }, + ], + primaryKey: ["reference"], + }, + data: [ + { reference: "test1", title: "Test Paper 1", filePaths: ["test1.pdf"] }, + { reference: "test2", title: "Test Paper 2", filePaths: [] }, + ], + }; + + const mockFilteredDataframeData = { + schema: { + fields: [ + { name: "reference", type: "string" }, + { name: "title", type: "string" }, + ], + 
primaryKey: ["reference"], + }, + data: [{ reference: "test1", title: "Test Paper 1", filePaths: ["test1.pdf"] }], + }; + + beforeEach(() => { + jest.clearAllMocks(); + + const compositionApi = require("@nuxtjs/composition-api"); + compositionApi.ref.mockImplementation((initialValue) => ({ + value: initialValue, + })); + compositionApi.watch.mockImplementation(() => {}); + + wrapper = mount(ImportFlow, { + propsData: { + isVisible: true, + workspace: mockWorkspace, + }, + stubs: { + BaseFlowModal: { + template: '
', + props: ["visible", "title", "steps", "currentStep"], + }, + ImportFileUpload: true, + ImportAnalysisTable: true, + ImportBatchProgress: true, + ImportSummary: true, + }, + mocks: { + $t: (key, params) => `${key}${params ? JSON.stringify(params) : ""}`, + }, + }); + + // Set initial bibData + wrapper.setData({ + bibData: { + fileName: "test.bib", + parsedEntries: [], + dataframeData: mockDataframeData, + rawContent: "", + }, + }); + }); + + afterEach(() => { + if (wrapper) { + wrapper.destroy(); + } + jest.restoreAllMocks(); + }); + + describe("Analysis Update Handling", () => { + it("should update confirmed documents from analysis update", () => { + const mockAnalysisData = { + confirmedDocuments: { test1: { document_create: {} } }, + totalConfirmed: 1, + documentActions: {}, + importMode: "all", + filteredDataframeData: mockDataframeData, + }; + + wrapper.vm.handleAnalysisUpdate(mockAnalysisData); + + expect(wrapper.vm.uploadData.confirmedDocuments).toEqual(mockAnalysisData.confirmedDocuments); + expect(wrapper.vm.bibData.dataframeData).toEqual(mockDataframeData); + }); + + it("should update bibData with filtered dataframe data when provided", () => { + const mockAnalysisData = { + confirmedDocuments: { test1: { document_create: {} } }, + totalConfirmed: 1, + documentActions: {}, + importMode: "with-pdfs", + filteredDataframeData: mockFilteredDataframeData, + }; + + wrapper.vm.handleAnalysisUpdate(mockAnalysisData); + + expect(wrapper.vm.uploadData.confirmedDocuments).toEqual(mockAnalysisData.confirmedDocuments); + expect(wrapper.vm.bibData.dataframeData).toEqual(mockFilteredDataframeData); + expect(wrapper.vm.bibData.dataframeData.data.length).toBe(1); + expect(wrapper.vm.bibData.dataframeData.data[0].reference).toBe("test1"); + }); + + it("should not update bibData.dataframeData if filteredDataframeData is not provided", () => { + const originalDataframeData = wrapper.vm.bibData.dataframeData; + + const mockAnalysisData = { + confirmedDocuments: { test1: { 
document_create: {} } }, + totalConfirmed: 1, + documentActions: {}, + importMode: "all", + // No filteredDataframeData provided + }; + + wrapper.vm.handleAnalysisUpdate(mockAnalysisData); + + expect(wrapper.vm.uploadData.confirmedDocuments).toEqual(mockAnalysisData.confirmedDocuments); + expect(wrapper.vm.bibData.dataframeData).toEqual(originalDataframeData); + }); + }); + + describe("Data Flow", () => { + it("should pass the correct dataframe data to ImportBatchProgress component", () => { + // Set up the component to be on step 2 (ImportBatchProgress) + wrapper.setData({ + currentStep: 2, + bibData: { + ...wrapper.vm.bibData, + dataframeData: mockFilteredDataframeData, + }, + }); + + // The ImportBatchProgress component should receive the filtered dataframe data + // This would be verified in integration tests, but we can check that the data is properly stored + expect(wrapper.vm.bibData.dataframeData).toEqual(mockFilteredDataframeData); + }); + }); +}); diff --git a/extralit-frontend/components/features/import/ImportModal.vue b/extralit-frontend/components/features/import/ImportFlow.vue similarity index 88% rename from extralit-frontend/components/features/import/ImportModal.vue rename to extralit-frontend/components/features/import/ImportFlow.vue index 0220ea56e..66f7d5a8b 100644 --- a/extralit-frontend/components/features/import/ImportModal.vue +++ b/extralit-frontend/components/features/import/ImportFlow.vue @@ -1,7 +1,7 @@