A Python CLI tool that converts SAS code files to Snowflake SQL procedures using the Google Gemini 2.0 Flash API.
- Convert individual SAS files or entire folders of SAS files
- Uses Google Gemini 2.0 Flash for intelligent code conversion
- Advanced prompt engineering with multiple conversion strategies
- Retry logic with different prompt types for complex conversions
- Quality evaluation system using RAG metrics (Context Relevance, Answer Relevance, Groundedness)
- Wikidata SQL reference integration for groundedness evaluation
- Handles various SAS procedures and data steps
- Generates Snowflake-optimized SQL with proper error handling
- Verbose logging for debugging and monitoring
- Command-line interface with argument validation
- Modular architecture with separate config, prompt, and evaluation modules
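The retry behaviour listed above (falling back to a different prompt type when a conversion fails) could look roughly like the sketch below. The prompt type names and the `convert` callable are hypothetical stand-ins for the tool's real prompt module and Gemini call, which are not shown in this README.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical prompt strategy names; the real tool's prompt types may differ.
PROMPT_TYPES = ("default", "detailed", "step_by_step")


def convert_with_retries(
    sas_code: str,
    convert: Callable[[str, str], Optional[str]],
    prompt_types: Sequence[str] = PROMPT_TYPES,
) -> Tuple[Optional[str], List[str]]:
    """Try each prompt type in order until one yields non-empty SQL.

    `convert(sas_code, prompt_type)` stands in for the Gemini call and
    returns the SQL text, or None / "" when the attempt fails.
    Returns (sql_or_None, prompt_types_tried).
    """
    tried: List[str] = []
    for prompt_type in prompt_types:
        tried.append(prompt_type)
        try:
            sql = convert(sas_code, prompt_type)
        except Exception:
            # Treat any API or parsing error as a failed attempt and move on.
            continue
        if sql and sql.strip():
            return sql, tried
    return None, tried
```

Passing the converter in as a callable keeps the retry loop independent of the API client, so it can be exercised with a fake function in tests.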
1. Clone this repository
2. Install dependencies:
pip install -r requirements.txt
3. Set your Google Gemini API key using one of these methods:
Option A: Environment variable
export GEMINI_API_KEY='your-api-key-here'
Option B: .env file (recommended)
cp env.example .env
# Edit the .env file and add your API key
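To illustrate how the two options above can coexist, here is a minimal sketch that prefers the `GEMINI_API_KEY` environment variable and falls back to a `.env` file. It assumes the `.env` file holds plain `NAME=value` lines; the function name `load_api_key` is illustrative and not necessarily what the tool's config module uses.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(env_path: str = ".env", var: str = "GEMINI_API_KEY") -> Optional[str]:
    """Return the API key from the environment, else from a .env file."""
    key = os.environ.get(var)
    if key:
        return key.strip()
    path = Path(env_path)
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == var:
            # Drop optional surrounding quotes, as in KEY='value'.
            return value.strip().strip("'\"") or None
    return None
```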
# Test your API key and connection
python cli.py test --verbose
# Or use the dedicated test script
python test_connection.py
# Convert a single file
python cli.py file input.sas output.sql
# Convert with comprehensive quality evaluation
python cli.py file input.sas output.sql --evaluate
# This generates:
# - output.sql (converted SQL)
# - output_evaluation.md (human-readable report)
# - output_evaluation.json (detailed metrics)
# Convert every SAS file in a folder
python cli.py folder ./sas_files ./sql_output
# Convert with verbose logging
python cli.py file input.sas output.sql --verbose
# Show help
python cli.py --help
python cli.py file --help
python cli.py folder --help
python cli.py test --help
1. API Key Issues
- Make sure your API key starts with 'AI'
- Verify the key is valid at https://makersuite.google.com/app/apikey
- Check that the key is properly set in your .env file
2. 400 Bad Request Errors
- Run python cli.py test --verbose to see detailed error information
- Check your internet connection
- Verify your API key has sufficient quota
3. File Not Found Errors
- Ensure input files have .sas extension
- Check file paths are correct
- Verify file permissions
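Several of the checks above can be run before calling the API at all. The following is a minimal sketch of such a preflight check; `preflight_issues` is a hypothetical helper, and matching the `.sas` extension case-insensitively is an assumption rather than documented behaviour of the tool.

```python
import os
from pathlib import Path
from typing import List, Optional


def preflight_issues(input_path: str, api_key: Optional[str]) -> List[str]:
    """Collect human-readable problems with the API key and input file."""
    issues: List[str] = []

    # API key checks from the troubleshooting list.
    if not api_key:
        issues.append("API key is not set")
    elif not api_key.startswith("AI"):
        issues.append("API key does not start with 'AI'")

    # Input file checks: extension, existence, read permission.
    path = Path(input_path)
    if path.suffix.lower() != ".sas":
        issues.append(f"{input_path}: expected a .sas extension")
    if not path.is_file():
        issues.append(f"{input_path}: file not found")
    elif not os.access(path, os.R_OK):
        issues.append(f"{input_path}: file is not readable")
    return issues
```

An empty list means none of these checks failed; quota and network problems still need python cli.py test --verbose to diagnose.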
Use the --verbose flag to see detailed logging:
python cli.py test --verbose
python cli.py file input.sas output.sql --verbose
The converter includes a comprehensive quality evaluation system based on RAG (Retrieval-Augmented Generation) metrics:
1. Context Relevance (0.0-1.0)
- Data Operations Coverage: How well all SAS data operations are covered
- Variable Mapping: Accuracy of SAS variable to SQL column mapping
- Logic Preservation: Preservation of business logic from SAS to SQL
- Transformation Completeness: Completeness of data transformations
2. Answer Relevance (0.0-1.0)
- Functional Equivalence: Whether SQL produces same results as SAS
- Output Structure: Correctness of table structure and output format
- Business Requirements: Fulfillment of business requirements
- Completeness: Coverage of all SAS code requirements
3. Groundedness (0.0-1.0)
- SQL Syntax Compliance: Adherence to standard SQL syntax
- Snowflake Optimization: Use of Snowflake-specific features
- Performance Considerations: Performance best practices
- Error Prevention: Handling of edge cases and errors
Scores map to the following quality ratings:
- EXCELLENT (≥0.9): Ready for production use
- GOOD (≥0.8): Minor improvements may be needed
- ACCEPTABLE (≥0.7): Some improvements recommended
- NEEDS_IMPROVEMENT (≥0.6): Significant improvements needed
- POOR (<0.6): Major issues need to be addressed
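The thresholds above translate directly into a lookup. The sketch below maps a score in the 0.0-1.0 range to its rating label; the `overall_score` helper assumes an unweighted mean of the three metrics, which is only an illustration since this README does not state how the tool combines them.

```python
from statistics import mean

# (minimum score, label) pairs, checked from highest to lowest.
RATING_THRESHOLDS = [
    (0.9, "EXCELLENT"),
    (0.8, "GOOD"),
    (0.7, "ACCEPTABLE"),
    (0.6, "NEEDS_IMPROVEMENT"),
]


def rating_for(score: float) -> str:
    """Map a score in [0.0, 1.0] to the rating labels listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    for threshold, label in RATING_THRESHOLDS:
        if score >= threshold:
            return label
    return "POOR"


def overall_score(context_relevance: float, answer_relevance: float,
                  groundedness: float) -> float:
    """Assumption: combine the three metrics with an unweighted mean."""
    return mean([context_relevance, answer_relevance, groundedness])
```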
The groundedness evaluation uses the Wikidata entry for SQL (Q47607) as its reference for SQL best practices and features.