Fix and unskip some skills E2E tests #69
Merged
SteveSandersonMS merged 7 commits into main on Jan 21, 2026
Pull request overview
This PR fixes the skills E2E tests to enable snapshot sharing across test runs and unskips the tests that work correctly. The changes address flakiness issues caused by unique directory names per test run, which prevented snapshot reuse.
Changes:
- Added cleanup logic to ensure each test starts with a fresh skills directory
- Standardized skill directory creation to use a single `.test_skills` directory instead of unique per-run directories
- Fixed line ending issues in Python and .NET for cross-platform compatibility
- Unskipped two working tests: "load and apply skill from skillDirectories" and "not apply skill when disabled via disabledSkills"
- Added comprehensive comments explaining why the session resume test remains skipped
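The cleanup and fixed-directory changes listed above could look roughly like the following in the Python suite. This is a minimal sketch, not the code from the PR: only the `.test_skills` name comes from the overview, while `reset_skills_dir` and its `work_dir` parameter are hypothetical names used for illustration.

```python
import shutil
from pathlib import Path

# The fixed directory name comes from the PR overview; the helper itself is hypothetical.
SKILLS_DIR_NAME = ".test_skills"


def reset_skills_dir(work_dir: Path) -> Path:
    """Remove any leftover skills directory and recreate it empty.

    Using the same directory name on every run keeps recorded paths stable,
    which is what lets snapshots be reused across test runs.
    """
    skills_dir = work_dir / SKILLS_DIR_NAME
    # ignore_errors=True makes this a no-op when the directory does not exist yet
    shutil.rmtree(skills_dir, ignore_errors=True)
    skills_dir.mkdir(parents=True)
    return skills_dir
```

In a pytest suite, a helper like this would typically be called from an autouse fixture so that every test starts from an empty directory.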
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| test/snapshots/skills/should_not_apply_skill_when_disabled_via_disabledskills.yaml | Added snapshot showing skill disabled behavior (no marker in response) |
| test/snapshots/skills/should_load_and_apply_skill_from_skilldirectories.yaml | Added snapshot showing successful skill loading and application |
| python/e2e/test_skills.py | Added cleanup fixture, standardized directory names, fixed line endings, unskipped working tests |
| nodejs/test/e2e/skills.test.ts | Added beforeEach cleanup, standardized directory names, unskipped suite with detailed skip comment for problematic test |
| go/e2e/skills_test.go | Added cleanup function, standardized directory names, renamed test function, unskipped working tests |
| dotnet/test/SkillsTests.cs | Added constructor cleanup, standardized directory names, fixed line endings, unskipped working tests |
friggeri approved these changes on Jan 21, 2026
This was referenced Jan 22, 2026
Fixes the skills E2E tests so they can share snapshots, and unskips the ones that work.
Previously, tests would pass or fail depending on the order in which they ran. I've tracked this down and am fairly confident there is a bug in the underlying "resume with skills" feature, perhaps two:
- "should apply skill on session resume with skillDirectories" fails if it runs on its own, but passes if it's in the same run as one of the other tests.
- If you unskip "should apply skill on session resume with skillDirectories" and run it before "should not apply skill when disabled via disabledSkills" in the same run, the disabledSkills test fails.
There's no .NET-specific bug that I'm aware of. The only .NET issues were the ones identified yesterday: snapshots could not be replayed because of mismatches in paths and line endings, both of which are now fixed.
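The PR text does not show how the line-ending mismatch was fixed. One common approach, sketched here under that caveat, is to write test files with LF endings on every platform so the content recorded in a snapshot is byte-identical on Windows and Unix. The helper name `write_skill_file` is hypothetical.

```python
from pathlib import Path


def write_skill_file(path: Path, text: str) -> None:
    """Write text using LF line endings regardless of the host platform."""
    # Collapse any CRLF or bare CR already present in the input
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # newline="\n" stops Python from translating "\n" to os.linesep on Windows
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(normalized)
```

Without the explicit `newline` argument, text-mode writes on Windows would emit CRLF, and a snapshot recorded on one platform might not match on another.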