Skip to content

fix(dav): Strip DTSTAMP before etag comparison in webcal refresh#58193

Open
Olen wants to merge 1 commit intonextcloud:masterfrom
Olen:fix/strip-dtstamp-webcal-refresh
Open

fix(dav): Strip DTSTAMP before etag comparison in webcal refresh#58193
Olen wants to merge 1 commit intonextcloud:masterfrom
Olen:fix/strip-dtstamp-webcal-refresh

Conversation

@Olen
Copy link

@Olen Olen commented Feb 9, 2026

Summary

  • Exclude DTSTAMP from etag computation in RefreshWebcalService to prevent false change detection
  • DTSTAMP is preserved in stored calendar data (required property per RFC 5545)

Problem

Many iCal providers (Google Calendar, Outlook 365, itslearning) set DTSTAMP to the current UTC time on every feed request per RFC 5545. Since Nextcloud computes the etag as md5(serialized_data) and DTSTAMP changes every time, every event appears modified on every refresh — even if nothing actually changed.

For a Google Calendar subscription with 2,621 events refreshed every ~15 minutes, this generates ~189K false change rows per day in oc_calendarchanges.

Reported at: #51120 (comment)

Approach

Instead of stripping DTSTAMP from the VObject (which would remove a required property from stored data), the fix strips DTSTAMP lines from a copy of the serialized string used only for etag computation:

$sObjectForEtag = preg_replace('/^DTSTAMP:.*\r?\n/m', '', $sObject);
$etag = md5($sObjectForEtag);

The stored calendar data ($sObject) is completely unchanged.

Test plan

  • New test testDtstampChangeDoesNotTriggerUpdate verifies a DTSTAMP-only change does not trigger updateCalendarObject
  • Updated identicalDataProvider etag computation to match the new DTSTAMP-stripped hashing
  • Run RefreshWebcalServiceTest unit tests

Fixes: #51120

🤖 Generated with Claude Code

@Olen Olen force-pushed the fix/strip-dtstamp-webcal-refresh branch 2 times, most recently from 7df84bc to 4c5b46f Compare February 9, 2026 11:38
@tcitworld
Copy link
Member

As the DTSTAMP is a required property, we shouldn't remove it from the saved payload, but just do the md5 hash without it.

@Olen Olen force-pushed the fix/strip-dtstamp-webcal-refresh branch from 4c5b46f to 36fdb9a Compare February 9, 2026 17:25
@Olen
Copy link
Author

Olen commented Feb 9, 2026

Thanks for the review @tcitworld — you're absolutely right. DTSTAMP is a required property per RFC 5545 and shouldn't be removed from the stored data.

Apologies for being too quick with the initial approach. I've reworked the fix to be much more minimal: instead of stripping DTSTAMP from the VObject before serialization, it now only strips the DTSTAMP lines from a copy of the serialized string used for the etag/md5 computation. The stored calendar data is completely unchanged — DTSTAMP is preserved.

The diff is now just one line replaced in RefreshWebcalService.php:

-$etag = md5($sObject);
+$sObjectForEtag = preg_replace('/^DTSTAMP:.*\r?\n/m', '', $sObject);
+$etag = md5($sObjectForEtag);

@Olen Olen force-pushed the fix/strip-dtstamp-webcal-refresh branch from 36fdb9a to 6d10452 Compare February 9, 2026 17:27
@Olen Olen force-pushed the fix/strip-dtstamp-webcal-refresh branch 2 times, most recently from 9d58868 to 6487bb3 Compare February 9, 2026 19:56
@Olen
Copy link
Author

Olen commented Feb 9, 2026

Good point @tcitworld — updated the regex to handle both parameters (DTSTAMP;TZID=...) and RFC 5545 content line folding:

'/^DTSTAMP[;:].*\r?\n([ \t].*\r?\n)*/m'

This matches DTSTAMP starting with either ; (parameters) or : (value), plus any continuation lines starting with a space or tab.

Also added a new test testFoldedDtstampChangeDoesNotTriggerUpdate that uses a DTSTAMP with a long parameter value (X-VOBJ-ORIGINAL-TZID=America/Argentina/Buenos_Aires) exceeding 75 bytes to verify the regex handles folded lines correctly.

@Olen Olen force-pushed the fix/strip-dtstamp-webcal-refresh branch 3 times, most recently from acbac30 to 496095a Compare February 9, 2026 20:04
Copy link
Contributor

@kesselb kesselb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your pr 👍

// to the current time on every feed request per RFC 5545, causing
// every event to appear modified on every refresh.
// DTSTAMP is kept in the stored data as it is a required property.
$sObjectForEtag = preg_replace('/^DTSTAMP[;:].*\r?\n([ \t].*\r?\n)*/m', '', $sObject);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems a bit odd. I think we should rather use the data from $vObject to calculate an etag instead of altering the serialized object with regex.

I'm also seeing the changing dtstamp with my google test account:

Request 1

BEGIN:VEVENT
DTSTART:20250905T163000Z
DTEND:20250905T173000Z
DTSTAMP:20260210T144443Z
UID:1234-1234-1234-1234
CREATED:20250904T204927Z
LAST-MODIFIED:20250904T204927Z
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:Testing 3
TRANSP:OPAQUE
END:VEVENT

Request 2 (a few seconds later)

BEGIN:VEVENT
DTSTART:20250905T163000Z
DTEND:20250905T173000Z
DTSTAMP:20260210T144446Z
UID:1234-1234-1234-1234
CREATED:20250904T204927Z
LAST-MODIFIED:20250904T204927Z
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:Testing 3
TRANSP:OPAQUE
END:VEVENT

What do you think to calculate the etag from "UID + Last-Modified + Sequence + DTSTART + DTSEND" @tcitworld @SebastianKrupinski?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kesselb

I think that approach would work... as long as the source updates those fields properly... which is a big IF...

My idea would be to just serialize the object twice... once with the DTSTAMP once with out... But you idea is more efficient

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just let me know if you decide you want to build the etag differently. I will update the PR to match what you think is best. I only noticed because the table was growing huge with updates for some of my calendars.

@kesselb kesselb added the 2. developing Work in progress label Feb 10, 2026
@kesselb kesselb added this to the Nextcloud 34 milestone Feb 10, 2026
@kesselb kesselb added the bug label Feb 10, 2026
@Olen Olen force-pushed the fix/strip-dtstamp-webcal-refresh branch from 496095a to 153f75a Compare February 10, 2026 18:54
Many iCal providers (Google Calendar, Outlook, itslearning) set DTSTAMP
to the current time on every feed request per RFC 5545, causing every
event to appear modified on every subscription refresh.

Strip DTSTAMP lines from a copy of the serialized data used for etag
computation only. The stored calendar data is unchanged — DTSTAMP is
preserved as it is a required property per RFC 5545.

The regex handles DTSTAMP with parameters (DTSTAMP;TZID=...) and
RFC 5545 content line folding where long lines are split with CRLF
followed by a space or tab.

Signed-off-by: Olen <regopa@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: olen <ola@nytt.no>
@Olen Olen force-pushed the fix/strip-dtstamp-webcal-refresh branch from 153f75a to 6e9ffc6 Compare February 11, 2026 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2. developing Work in progress bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Optimize Deletion of Objects for Calendar Subscriptions

4 participants