Hewlbern/procedure-conversion

SAS to Snowflake SQL Converter

A Python CLI tool that converts SAS code files to Snowflake SQL procedures using the Google Gemini 2.0 Flash API.

Features

  • Convert individual SAS files or entire folders of SAS files
  • Uses Google Gemini 2.0 Flash for intelligent code conversion
  • Advanced prompt engineering with multiple conversion strategies
  • Retry logic with different prompt types for complex conversions
  • Quality evaluation system using RAG metrics (Context Relevance, Answer Relevance, Groundedness)
  • Wikidata SQL reference integration for groundedness evaluation
  • Handles various SAS procedures and data steps
  • Generates Snowflake-optimized SQL with proper error handling
  • Verbose logging for debugging and monitoring
  • Command-line interface with argument validation
  • Modular architecture with separate config, prompt, and evaluation modules
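
The retry behaviour listed above can be pictured with a minimal sketch. It assumes a hypothetical `convert(sas_code, prompt_type)` callable and made-up prompt type names; neither comes from this repository, and the actual tool may order or select prompts differently.

```python
def convert_with_retries(sas_code, convert,
                         prompt_types=("basic", "detailed", "step_by_step")):
    """Try each prompt type in order and return the first non-empty result.

    `convert` and the prompt type names are hypothetical placeholders.
    """
    errors = []
    for prompt_type in prompt_types:
        try:
            sql = convert(sas_code, prompt_type)
        except Exception as exc:  # a real tool would catch narrower API errors
            errors.append(f"{prompt_type}: {exc}")
            continue
        if sql and sql.strip():
            return sql, prompt_type
        errors.append(f"{prompt_type}: empty response")
    raise RuntimeError("all prompt types failed: " + "; ".join(errors))
```

Passing the conversion function in as an argument keeps the retry loop independent of any particular API client, which also makes it easy to exercise with a stub.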

Installation

  1. Clone this repository

  2. Install dependencies:

    pip install -r requirements.txt

  3. Set your Google Gemini API key using one of these methods:

    Option A: Environment variable

    export GEMINI_API_KEY='your-api-key-here'

    Option B: .env file (recommended)

    cp env.example .env
    # Edit .env file and add your API key
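
For illustration, the two options could be resolved in code roughly as follows. This is a standard-library sketch, not the repository's implementation: the function name `load_api_key` is hypothetical, the precedence (environment variable first, then `.env`) is an assumption, and the tool may well use a dedicated library to read `.env` files.

```python
import os
from pathlib import Path


def load_api_key(env_path=".env", var_name="GEMINI_API_KEY"):
    """Return the API key from the environment, falling back to a .env file."""
    # Option A: an exported environment variable is checked first (assumed order).
    key = os.environ.get(var_name)
    if key:
        return key

    # Option B: scan a simple KEY=value file, skipping blanks and comments.
    path = Path(env_path)
    if path.is_file():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == var_name:
                # Drop surrounding whitespace and optional quotes.
                return value.strip().strip("'\"")

    raise RuntimeError(f"{var_name} is not set; export it or add it to {env_path}")
```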

Usage

Test API Connection First

# Test your API key and connection
python cli.py test --verbose

# Or use the dedicated test script
python test_connection.py

Convert a Single File

python cli.py file input.sas output.sql

Convert with Quality Evaluation

# Convert with comprehensive quality evaluation
python cli.py file input.sas output.sql --evaluate

# This generates:
# - output.sql (converted SQL)
# - output_evaluation.md (human-readable report)
# - output_evaluation.json (detailed metrics)
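
The report names above follow a simple pattern: the SQL file's stem plus `_evaluation`, with `.md` and `.json` extensions. A small sketch of that naming rule, using a hypothetical helper `evaluation_paths` and assuming the reports are written next to the SQL output:

```python
from pathlib import Path


def evaluation_paths(sql_output):
    """Return the (markdown, json) report paths that sit next to the SQL output."""
    out = Path(sql_output)
    # Assumption: reports share the SQL file's directory and stem.
    report = out.with_name(f"{out.stem}_evaluation.md")
    metrics = out.with_name(f"{out.stem}_evaluation.json")
    return report, metrics
```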

Convert All Files in a Folder

python cli.py folder ./sas_files ./sql_output
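
As a rough picture of what folder mode involves, the sketch below pairs every `.sas` file in the input folder with a `.sql` target in the output folder. The helper name `plan_folder_conversion` and the output naming (same stem, `.sql` extension, no recursion into subfolders) are assumptions rather than documented behaviour.

```python
from pathlib import Path


def plan_folder_conversion(input_dir, output_dir):
    """Pair each .sas file in input_dir with a .sql target in output_dir."""
    src, dst = Path(input_dir), Path(output_dir)
    # Assumption: only top-level *.sas files are picked up, in sorted order.
    return [(sas, dst / f"{sas.stem}.sql") for sas in sorted(src.glob("*.sas"))]
```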

Enable Verbose Logging

python cli.py file input.sas output.sql --verbose

Get Help

python cli.py --help
python cli.py file --help
python cli.py folder --help
python cli.py test --help

Troubleshooting

Common Issues

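1. API Key Issues

  • Confirm that GEMINI_API_KEY is set, either as an environment variable or in your .env file
  • Run python cli.py test --verbose to check the key and connection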

2. 400 Bad Request Errors

  • Run python cli.py test --verbose to see detailed error information
  • Check your internet connection
  • Verify your API key has sufficient quota

3. File Not Found Errors

  • Ensure input files have .sas extension
  • Check file paths are correct
  • Verify file permissions
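
The three checks above can also be run before invoking the converter. The following is a minimal sketch with a hypothetical helper name, `check_sas_input`; treating the extension as case-insensitive is an assumption, and the CLI's own validation may differ.

```python
import os
from pathlib import Path


def check_sas_input(path):
    """Return a list of problems with a SAS input path (empty if it looks fine)."""
    p = Path(path)
    problems = []
    # Extension check (case-insensitive here, which is an assumption).
    if p.suffix.lower() != ".sas":
        problems.append(f"{p.name} does not have a .sas extension")
    # Path check, then permission check only if the file exists.
    if not p.is_file():
        problems.append(f"{p} does not exist or is not a regular file")
    elif not os.access(p, os.R_OK):
        problems.append(f"{p} is not readable by the current user")
    return problems
```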

Debug Mode

Use the --verbose flag to see detailed logging:

python cli.py test --verbose
python cli.py file input.sas output.sql --verbose

Quality Evaluation

The converter scores each conversion on three metrics adapted from RAG (Retrieval-Augmented Generation) evaluation. Each metric ranges from 0.0 to 1.0 and is assessed against the criteria listed below:

Evaluation Metrics

1. Context Relevance (0.0-1.0)

  • Data Operations Coverage: How well all SAS data operations are covered
  • Variable Mapping: Accuracy of SAS variable to SQL column mapping
  • Logic Preservation: Preservation of business logic from SAS to SQL
  • Transformation Completeness: Completeness of data transformations

2. Answer Relevance (0.0-1.0)

  • Functional Equivalence: Whether SQL produces same results as SAS
  • Output Structure: Correctness of table structure and output format
  • Business Requirements: Fulfillment of business requirements
  • Completeness: Coverage of all SAS code requirements

3. Groundedness (0.0-1.0)

  • SQL Syntax Compliance: Adherence to standard SQL syntax
  • Snowflake Optimization: Use of Snowflake-specific features
  • Performance Considerations: Performance best practices
  • Error Prevention: Handling of edge cases and errors
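
To make the shape of these metrics concrete, here is a small sketch of a container for the three scores. The class name `EvaluationScores` is hypothetical, and the `overall()` method uses an unweighted mean purely as an assumption; this README does not state how the tool combines the metrics into a single score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationScores:
    """Three metric scores, each expected to lie in [0.0, 1.0]."""
    context_relevance: float
    answer_relevance: float
    groundedness: float

    def __post_init__(self):
        # Reject scores outside the documented 0.0-1.0 range.
        for name in ("context_relevance", "answer_relevance", "groundedness"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")

    def overall(self):
        # Assumption: unweighted mean; the real weighting is not documented here.
        return (self.context_relevance + self.answer_relevance + self.groundedness) / 3
```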

Quality Levels

  • EXCELLENT (≥0.9): Ready for production use
  • GOOD (≥0.8): Minor improvements may be needed
  • ACCEPTABLE (≥0.7): Some improvements recommended
  • NEEDS_IMPROVEMENT (≥0.6): Significant improvements needed
  • POOR (<0.6): Major issues need to be addressed
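
The thresholds above translate directly into a lookup. The sketch below uses a hypothetical function name, `quality_level`, and assumes it receives a single score in the 0.0-1.0 range; how the tool arrives at that score is not specified here.

```python
def quality_level(score):
    """Map a score in [0.0, 1.0] to the quality level names listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0.0, 1.0], got {score}")
    # Thresholds are checked from highest to lowest; the first match wins.
    thresholds = [
        (0.9, "EXCELLENT"),
        (0.8, "GOOD"),
        (0.7, "ACCEPTABLE"),
        (0.6, "NEEDS_IMPROVEMENT"),
    ]
    for minimum, label in thresholds:
        if score >= minimum:
            return label
    return "POOR"
```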

SQL Reference

The groundedness evaluation uses the Wikidata entry for SQL (Q47607) as its reference for SQL best practices and features.
