[Refactor] change python namespace #110

@dawn-tran

Summary

Rename the Python packages so that import argilla becomes import extralit and import argilla_server becomes import extralit_server across the entire codebase, making the Python SDK namespace coherent with the Extralit brand.

Motivation

Make the Python SDK more coherent by aligning all import statements and package references with the Extralit brand identity. This will:

  • Remove confusion between the original Argilla project and Extralit
  • Create a consistent developer experience across all components
  • Establish clear separation from the upstream Argilla project
  • Improve brand recognition and consistency

Proposed Refactor

This is a comprehensive codebase-wide refactor that needs to be executed systematically across multiple components:

1. Package Structure Changes

Current Structure:

  • extralit/src/argilla/ → Should become extralit/src/extralit/
  • argilla-server/src/argilla_server/ → Should become extralit-server/src/extralit_server/

Steps:

  1. Rename the source directory structure:
    # In extralit/
    mv src/argilla src/extralit
    
    # In argilla-server/
    mv src/argilla_server src/extralit_server

2. Python Import Statement Updates

Scope Analysis:

  • ~500+ import statements need to be updated across the codebase
  • Main patterns to replace:
    • import argilla → import extralit
    • from argilla → from extralit
    • import argilla_server → import extralit_server
    • from argilla_server → from extralit_server
    • Internal relative imports within modules
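The replacement patterns above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the `rewrite_imports` function and the `RENAMES` table are hypothetical names, not part of the codebase, and the regex only handles statements that begin with `import` or `from`. Comma-separated imports such as `import os, argilla` and any other `argilla_*` package names are deliberately left untouched.

```python
import re

# Old -> new top-level package names, taken from the patterns listed above.
RENAMES = {"argilla": "extralit", "argilla_server": "extralit_server"}

# Match "import <pkg>" or "from <pkg>" at the start of a line (after optional
# indentation). The trailing \b means "argilla" never matches the prefix of a
# longer identifier such as "argilla_server".
_IMPORT_RE = re.compile(
    r"^(?P<indent>[ \t]*)(?P<kw>import|from)[ \t]+(?P<pkg>argilla_server|argilla)\b",
    re.MULTILINE,
)


def rewrite_imports(source: str) -> str:
    """Return `source` with old package names replaced in import statements."""

    def _swap(match: re.Match) -> str:
        return f"{match['indent']}{match['kw']} {RENAMES[match['pkg']]}"

    return _IMPORT_RE.sub(_swap, source)


if __name__ == "__main__":
    sample = "import argilla as rg\nfrom argilla_server.models import User\n"
    print(rewrite_imports(sample))
```

Because the match is anchored to the start of a statement, string literals that merely contain the text `import argilla` are not rewritten; those still need the manual pass described later in this issue.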

Files requiring updates:

  • All Python files in extralit/src/
  • All Python files in argilla-server/src/
  • All Python files in argilla-v1/src/
  • Test files in both packages
  • Example files in examples/
  • Configuration and setup files

3. Configuration File Updates

pyproject.toml files:

  • extralit/pyproject.toml: Update package name, scripts, and version paths
  • argilla-server/pyproject.toml: Update package name and CLI scripts
  • Update PDM script references and module paths

Docker and Infrastructure:

  • Docker compose files in .devcontainer/
  • Dockerfile references
  • Database connection strings and environment variables
  • GitHub Actions workflows

4. Documentation and Metadata Updates

Files to update:

  • .github/copilot-instructions.md: Update all references
  • README files across all packages
  • Issue templates in .github/ISSUE_TEMPLATE/
  • Development container configurations
  • Code documentation and docstrings

Recommended Tools and Approach

Phase 1: Automated Search & Replace

Use these tools for bulk replacements:

  1. ripgrep + sed for Python files:
# Replace import statements (assumes GNU sed and xargs; on macOS/BSD use `sed -i ''`).
# \b marks a word boundary, so "argilla" does not also match "argilla_server".
rg -l '\bimport argilla\b' --type py | xargs -r sed -i 's/\bimport argilla\b/import extralit/g'
rg -l '\bfrom argilla\b' --type py | xargs -r sed -i 's/\bfrom argilla\b/from extralit/g'
rg -l '\bimport argilla_server\b' --type py | xargs -r sed -i 's/\bimport argilla_server\b/import extralit_server/g'
rg -l '\bfrom argilla_server\b' --type py | xargs -r sed -i 's/\bfrom argilla_server\b/from extralit_server/g'
  2. VS Code Global Search & Replace:

    • Use regex mode for pattern matching
    • Target file types: *.py,*.toml,*.yaml,*.yml,*.json,*.md
    • Patterns (\b also matches at the end of a line, where a trailing character class would not):
      • \bimport argilla\b → import extralit
      • \bfrom argilla\b → from extralit
      • argilla_server → extralit_server
  3. Python AST-based refactoring tools:

    • libcst or refactor for more sophisticated Python code transformations
    • Can handle complex import scenarios and maintain proper formatting

Phase 2: Manual Verification & Edge Cases

  1. String literals and configuration:

    • Database URLs: ARGILLA_DATABASE_URL → EXTRALIT_DATABASE_URL
    • Logger names: logging.getLogger("argilla") → logging.getLogger("extralit")
    • Package references in setup scripts
  2. Complex import scenarios:

    • Conditional imports
    • Dynamic imports using importlib
    • String-based module references
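Automated search and replace on import statements will not catch names that live inside strings, such as logger names or module paths passed to importlib. One way to list those candidates for manual review is to walk the syntax tree with the standard-library `ast` module. The sketch below is an assumption about how such a check could look; `find_string_references` is a hypothetical helper, and it also reports docstrings, since those are string literals too.

```python
import ast


def find_string_references(source: str, needle: str = "argilla"):
    """Return sorted (line, text) pairs for string literals containing `needle`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # ast.Constant covers plain string literals, docstrings, and the
        # literal parts of f-strings.
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if needle in node.value.lower():
                hits.append((node.lineno, node.value))
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "import importlib\n"
        "import logging\n"
        'log = logging.getLogger("argilla")\n'
        'mod = importlib.import_module("argilla_server.api")\n'
    )
    for line, text in find_string_references(sample):
        print(f"line {line}: {text!r}")
```

Each hit still needs a human decision: a logger name can probably be renamed directly, while a string that refers to the upstream Argilla project should stay as it is.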

Phase 3: Directory Structure Migration

# Create new directory structure
mkdir -p extralit-server/src/extralit_server
mkdir -p extralit/src/extralit

# Move source code (after import updates)
rsync -av argilla-server/src/argilla_server/ extralit-server/src/extralit_server/
rsync -av extralit/src/argilla/ extralit/src/extralit/

# Clean up old directories
rm -rf argilla-server/src/argilla_server
rm -rf extralit/src/argilla

Acceptance Criteria

Functional Requirements:

  • All Python import statements use extralit/extralit_server namespaces
  • Package names in pyproject.toml files are updated
  • CLI scripts and entry points use new package names
  • All internal module references are consistent
  • Docker and development environment configurations updated

Quality Assurance:

  • No licensing issues with Apache-2.0 license
  • All pre-commit hooks pass
  • Core functionality tests pass: pdm run test tests/unit/services/test_schemas.py -v
  • Major integration tests pass
  • No broken imports or missing module errors
  • Documentation is updated and consistent
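The "no broken imports" criterion could be backed by a small check that fails when any absolute import of the old packages remains. The following is a minimal sketch using the standard-library `ast` module; `leftover_imports` and `OLD_PACKAGES` are hypothetical names, and wiring the check into pre-commit or CI is left open.

```python
import ast

# Top-level package names that should no longer be imported after the refactor.
OLD_PACKAGES = {"argilla", "argilla_server"}


def leftover_imports(source: str):
    """Return sorted (line, module) pairs for imports of the old packages."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped.
            modules = [node.module]
        else:
            continue
        for module in modules:
            if module.split(".")[0] in OLD_PACKAGES:
                found.append((node.lineno, module))
    return sorted(found)


if __name__ == "__main__":
    sample = "import os, argilla\nfrom argilla_server.api import app\nimport extralit\n"
    print(leftover_imports(sample))
```

Unlike a text search, this also catches imports nested inside `try` blocks or functions and comma-separated forms such as `import os, argilla`.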

Development Workflow:

  • Development containers work with new package structure
  • Build processes (Docker, PDM) work correctly
  • VS Code configurations and search exclusions updated
  • Database migrations work with new module paths

Risk Assessment & Mitigation

High Risk Items:

  1. Breaking changes for existing users - needs major version bump
  2. Database migration paths may break if module names are hardcoded
  3. Third-party integrations expecting argilla import
  4. Complex import dependencies between packages

Mitigation Strategies:

  1. Comprehensive test coverage before and after changes
  2. Staged rollout: Complete backend first, then frontend, then examples
  3. Temporary import compatibility layer during the transition
  4. Detailed migration documentation for users
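The issue does not specify how the import compatibility layer would work. One possible shape, shown here purely as a sketch, is to register the new package in `sys.modules` under the old name and emit a `DeprecationWarning`. The `alias_package` helper is hypothetical, and this simple version only aliases submodules that are already loaded; a submodule first imported through the old name would be loaded as a separate module object, so a production shim would likely need more care.

```python
import importlib
import sys
import warnings


def alias_package(old_name: str, new_name: str) -> None:
    """Make `import old_name` resolve to the already-renamed package."""
    new_pkg = importlib.import_module(new_name)
    warnings.warn(
        f"'{old_name}' has been renamed to '{new_name}'; please update your imports.",
        DeprecationWarning,
        stacklevel=2,
    )
    sys.modules[old_name] = new_pkg
    # Also alias submodules that have already been imported under the new name.
    prefix = new_name + "."
    for name, module in list(sys.modules.items()):
        if name.startswith(prefix):
            sys.modules[old_name + name[len(new_name):]] = module


# Hypothetical usage once the renamed package is installed:
# alias_package("argilla", "extralit")
# import argilla  # now resolves to the extralit package, with a warning
```

Keeping the alias behind a deprecation warning gives third-party integrations a release cycle to migrate before the old name is removed.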

Implementation Plan

  1. Week 1: Automated search & replace for Python imports
  2. Week 1: Update configuration files and build systems
  3. Week 2: Directory structure migration and path updates
  4. Week 2: Manual verification and edge case fixes
  5. Week 3: Testing, documentation, and final validation
  6. Week 3: Integration testing and deployment verification

Additional Context

This refactor touches the core identity of the codebase. All argilla-specific code was originally forked from the argilla-io/argilla repo, though many functions have since been heavily rewritten and upgraded. The refactoring will establish Extralit as a distinct project while maintaining all existing functionality.

Related Files:

  • Core package configs: extralit/pyproject.toml, argilla-server/pyproject.toml
  • Version files: extralit/src/argilla/_version.py, argilla-server/src/argilla_server/_version.py
  • Main application entry points: argilla-server/src/argilla_server/_app.py
  • Development environment: .devcontainer/, docker-compose.yaml

Labels

refactor: Code refactoring or technical debt improvements
