Implement data loading and rotation augmentation integration #2
PIPELINE_COVERAGE.md mentioned data loading and rotation needed improvements but lacked specifics on what was missing or how to implement fixes.
This PR provides the complete specification AND full implementation of the data loading and rotation augmentation features.
Changes
Phase 1: Specification (Original)
New: DATA_LOADING_ROTATION_IMPROVEMENTS.md (496 lines)
Comprehensive specification detailing:
Data Loading Issues:
- DataGenerator class wrapping DATASET.py
- ModelTrainer.__init__() to accept DataGenerator
- lineament-train command
Rotation Augmentation Issues:
- RotationAugmentation Keras layer
- AugmentationConfig dataclass
- build_model() to apply augmentation
Updated: PIPELINE_COVERAGE.md, README.md, CONTRIBUTING.md
Phase 2: Implementation (New)
Implemented: DataGenerator Class (data_generator.py - 219 lines)
- create_training_dataset() and create_validation_dataset() methods
- get_dataset_info()
Implemented: RotationAugmentation Layer (model_modern.py)
Implemented: AugmentationConfig (config.py)
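The dataclass itself is not reproduced in this description. As a minimal sketch, assuming its fields mirror the keys of the "augmentation" block shown under Configuration File, it might look like the following. The default values, the validation in `__post_init__`, and the `from_json` helper are assumptions for illustration and are not taken from config.py.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class AugmentationConfig:
    """Hypothetical mirror of the "augmentation" block in the JSON config."""

    # Field names follow the JSON example; the defaults are assumed.
    enable_rotation: bool = False
    rotation_probability: float = 0.5
    rotation_angles: List[int] = field(default_factory=lambda: [0, 90, 180, 270])
    enable_flipping: bool = False

    def __post_init__(self) -> None:
        # Reject probabilities outside [0, 1] early (assumed behaviour).
        if not 0.0 <= self.rotation_probability <= 1.0:
            raise ValueError("rotation_probability must be in [0, 1]")

    @classmethod
    def from_json(cls, text: str) -> "AugmentationConfig":
        # Read only the "augmentation" section; missing keys keep their defaults.
        data = json.loads(text).get("augmentation", {})
        return cls(**data)


example = (
    '{"augmentation": {"enable_rotation": true, "rotation_probability": 0.5, '
    '"rotation_angles": [0, 90, 180, 270], "enable_flipping": true}}'
)
config = AugmentationConfig.from_json(example)
```

Keeping the field names identical to the JSON keys lets the section be unpacked directly into the constructor, at the cost of raising a `TypeError` on unknown keys.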
Enhanced: build_model() Function (model_modern.py)
Enhanced: ModelTrainer Class (model_modern.py)
- data_generator parameter in constructor
- train() method accepts data_path for automatic DataGenerator creation
Enhanced: CLI (cli.py)
- --enable-rotation, --rotation-prob, --enable-flipping options
- --train-ratio, --val-ratio, --choosy options
New: Documentation and Examples
Usage
Command Line
python cli.py train --data dataset.mat --output ./models \
    --enable-rotation --rotation-prob 0.5 \
    --enable-flipping --tensorboard
Python API
Configuration File
{
  "augmentation": {
    "enable_rotation": true,
    "rotation_probability": 0.5,
    "rotation_angles": [0, 90, 180, 270],
    "enable_flipping": true
  }
}
Key Features
✅ Automatic data loading from .mat files without manual code
✅ Efficient tf.data.Dataset pipelines with prefetching and batching
✅ Graph-compatible rotation augmentation using TensorFlow operations
✅ Flipping augmentation for additional data variety
✅ Configuration-based via JSON files or CLI options
✅ 100% backward compatible - all existing code still works
✅ Production-ready - code reviewed and refined
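To make the rotation feature concrete, here is a rough illustration of the augmentation logic only. It uses NumPy rather than the TensorFlow operations the PR describes, so it is a stand-in and not the RotationAugmentation layer itself. The function name `random_rot90` and its signature are hypothetical. The `probability` and `angles` parameters correspond to `rotation_probability` and `rotation_angles` in the configuration. In a TensorFlow graph the analogous operation would be `tf.image.rot90`.

```python
import numpy as np


def random_rot90(image, rng, probability=0.5, angles=(0, 90, 180, 270)):
    """Rotate an (H, W, C) patch by a randomly chosen multiple of 90 degrees.

    With probability `probability` one angle is drawn from `angles` and the
    patch is rotated counterclockwise; otherwise it is returned unchanged.
    Square patches keep their shape under every rotation.
    """
    if any(angle % 90 != 0 for angle in angles):
        raise ValueError("angles must be multiples of 90 degrees")
    # rng.random() is in [0, 1), so probability=1.0 always rotates
    # and probability=0.0 never does.
    if rng.random() >= probability:
        return image
    angle = int(rng.choice(angles))
    # k counts quarter turns; axes=(0, 1) rotates the spatial dimensions only.
    return np.rot90(image, k=angle // 90, axes=(0, 1))


rng = np.random.default_rng(0)
patch = np.arange(16).reshape(4, 4, 1)
rotated = random_rot90(patch, rng, probability=1.0, angles=(90,))
```

Restricting rotations to multiples of 90 degrees avoids interpolation, so pixel values are preserved exactly and only their positions change.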
Files Changed
Testing
Backward Compatibility
All existing usage patterns continue to work: