
feat: hash PII (email/SMS) in SharedPreferences at rest #2614

Merged
sherwinski merged 9 commits into 5.7-main from sherwin/sdk-4202-sensitive-user-data-stored-in-plain-text-onesignalxml
Apr 21, 2026

Conversation

Contributor

@sherwinski sherwinski commented Apr 15, 2026

Description

One Line Summary

Hash email and SMS subscription addresses (SHA-256) before writing to SharedPreferences so PII is not stored in plain text on device.

Details

Motivation

Email addresses and phone numbers are stored in plain text in OneSignal.xml under the MODEL_STORE_subscriptions key. On rooted devices or via ADB backup, this PII is directly readable. This PR hashes those values at the serialization boundary so the on-disk representation is opaque.

Scope

  • Email and SMS subscription addresses only. Push tokens are not hashed — they are not PII and are required for push delivery.
  • On-disk only. The in-memory SubscriptionModel.address always holds the raw value at runtime (after server hydration). Network requests continue to send the real email/phone to the backend API.
  • No public API changes. The IEmailSubscription.email and ISmsSubscription.number getters return the raw value post-hydration, or an empty string during the brief cold-start window before RefreshUserOperationExecutor restores the real value.

How it works

  1. PIIHasher — new utility: deterministic SHA-256 hashing and isHashed() detection (64-char lowercase hex).
  2. ModelStore.transformJsonForPersistence() — new protected hook called during persist(). Default is identity; subclasses can override to transform JSON before it hits SharedPreferences.
  3. SubscriptionModelStore — overrides the hook to hash the address field for EMAIL and SMS subscriptions. Skips PUSH. Skips already-hashed values (idempotency guard against double-hashing on restart).
  4. Hash-aware removal/lookup: SubscriptionManager.removeEmailSubscription(), removeSmsSubscription(), SubscriptionList.getByEmail(), and getBySMS() now compare the raw input against both the raw and hashed model.address, so lookups work whether the model is pre- or post-hydration.
  5. Log redaction: the SubscriptionManager.addSubscriptionToModels() debug log now prints the hash instead of the raw address.
  6. Public getter safety — EmailSubscription.email and SmsSubscription.number return "" if the underlying address is detected as a hash (cold-start to hydration window).
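
As an illustration of step 1, here is a minimal sketch of what a hashing utility of this shape could look like. The object name PiiHasherSketch and its internals are assumptions made for this example; the PR's actual PIIHasher may differ (for instance in input normalization).

```kotlin
import java.security.MessageDigest

// Hypothetical stand-in for the PR's PIIHasher; names and details are assumed.
object PiiHasherSketch {
    // A hash is recognized purely by shape: exactly 64 lowercase hex characters.
    private val hashPattern = Regex("[0-9a-f]{64}")

    /** Returns the SHA-256 digest of [value] as 64 lowercase hex characters. */
    fun hash(value: String): String =
        MessageDigest.getInstance("SHA-256")
            .digest(value.toByteArray(Charsets.UTF_8))
            .joinToString("") { "%02x".format(it) }

    /** True when the whole of [value] has the shape of a SHA-256 hex digest. */
    fun isHashed(value: String): Boolean = hashPattern.matches(value)
}

fun main() {
    val digest = PiiHasherSketch.hash("test@example.com")
    println(digest.length)                                 // length of the hex digest
    println(PiiHasherSketch.isHashed(digest))              // digest is detected as a hash
    println(PiiHasherSketch.isHashed("test@example.com"))  // raw email is not
}
```

Because detection is shape-based, a raw value that happens to be 64 lowercase hex characters would also be treated as a hash in this sketch; email addresses and phone numbers do not normally have that shape.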

Trade-off

Between cold start and server hydration (~0.5–1s), model.address contains a hash loaded from SharedPreferences. During this window, public getters return an empty string. Once RefreshUserOperationExecutor hydrates the model from the server response, the real value is restored and persisted as a new hash.

Note: The IEmailSubscription.email and ISmsSubscription.number getters exist but are not reachable through the current public API (IUserManager does not expose email/SMS subscription collections). If a future API exposes them, they will return an empty string during the brief cold-start to hydration window.
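
The getter behaviour described in this trade-off can be sketched roughly as follows. SubscriptionModelSketch, EmailSubscriptionSketch, and looksHashed are hypothetical stand-ins for the SDK's SubscriptionModel, EmailSubscription, and PIIHasher.isHashed; only the shape of the getter follows the PR.

```kotlin
// Hypothetical stand-in for the SDK's SubscriptionModel (the real class has more fields).
class SubscriptionModelSketch(var address: String)

// Assumed detection rule from the PR description: 64 lowercase hex characters.
private val hex64 = Regex("[0-9a-f]{64}")
fun looksHashed(value: String): Boolean = hex64.matches(value)

// Hypothetical stand-in for EmailSubscription.
class EmailSubscriptionSketch(private val model: SubscriptionModelSketch) {
    val email: String
        get() {
            val address = model.address
            // Before hydration the address is a hash loaded from disk: expose "" instead.
            return if (looksHashed(address)) "" else address
        }
}

fun main() {
    // Cold start: the model was loaded from SharedPreferences and holds a hash-shaped value.
    val model = SubscriptionModelSketch("a".repeat(64))
    val subscription = EmailSubscriptionSketch(model)
    println(subscription.email.isEmpty())

    // Hydration: the server response restores the raw value in memory.
    model.address = "test@example.com"
    println(subscription.email)
}
```

Callers that might read the getter during the cold-start window would need to treat an empty string as "not available yet" rather than "no address".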

Testing

Unit testing

All 560 existing unit tests pass. The ModelingTests deadlock test exercised the new transformJsonForPersistence path — an initial NPE (accessing model.type on an uninitialized model) was caught and fixed by reading the type from the JSON object instead.

New tests were added for the following:

  • PIIHasherTests: hash output format, determinism, known digest,
    isHashed detection for valid/invalid inputs
  • SubscriptionModelStoreTests: persist hashes email/SMS but not push,
    idempotent on already-hashed values, in-memory model stays raw
  • SubscriptionManagerTests: hash-aware removal for email/SMS when
    model.address is hashed (pre-hydration), public getter returns
    empty string for hashed addresses, getByEmail/getBySMS find
    subscriptions with hashed addresses

Manual testing

Tested on a Pixel emulator (API 35) with the OneSignal demo app:

  1. Fresh email/SMS add — Added email and SMS via demo app. Dumped SharedPreferences → addresses are 64-char SHA-256 hashes. Push token is unchanged (not hashed).
  2. Network requests — Verified via logcat that API requests to OneSignal backend contain the raw email/phone, not hashes.
  3. Post-hydration removal — Removed email after hydration completed → subscription removed from SharedPreferences successfully.
  4. Pre-hydration removal — Added removeEmail() call immediately after initWithContext (before RefreshUserOperationExecutor runs). The hash-aware comparison matched the raw input against the hashed stored value → subscription removed correctly.
  5. Upgrade migration (plain text → hashed) — Installed 5.7-main build, added email (stored as plain text). Installed feature branch build over the top (no uninstall). On first launch, the plain-text address was automatically hashed during the first persist cycle.
  6. Removal after migration: removeEmail() with the raw email successfully removed the now-hashed subscription.

Affected code checklist

  • Notifications
    • Display
    • Open
    • Push Processing
    • Confirm Deliveries
  • Outcomes
  • Sessions
  • In-App Messaging
  • REST API requests
  • Public API changes

Checklist

Overview

  • I have filled out all REQUIRED sections above
  • PR does one thing
  • Any Public API changes are explained in the PR details and conform to existing APIs

Testing

  • I have included test coverage for these changes, or explained why they are not needed
  • All automated tests pass, or I explained why that is not possible
  • I have personally tested this on my device, or explained why that is not possible

Final pass

  • Code is as readable as possible.
  • I have reviewed this PR myself, ensuring it meets each checklist item

@sherwinski sherwinski force-pushed the sherwin/sdk-4202-sensitive-user-data-stored-in-plain-text-onesignalxml branch from 5c2841e to 322b181 on April 15, 2026 17:28
- get() = model.address
+ get() {
+     val address = model.address
+     return if (PIIHasher.isHashed(address)) "" else address

Contributor

Why return empty? Would that affect the getBySMS check?

Contributor Author

getBySMS reads model.address directly (not the public getter), so there's no impact. The empty return is to protect against leaking a hash to app developers during the cold-start-to-hydration window. I opted for this approach to signal to devs that the value isn't available yet as opposed to providing a confusing value.

Contributor

Instead of this, can we pass an enum back? Something like:

Hash("value")
Address("value")
Unknown

Then it's clearer and more explicit.

Contributor Author

I agree with you, but I'd want to clarify one thing first.

ISmsSubscription and IEmailSubscription expose public getters for email and sms, so passing an enum back would technically be a breaking change. The wrinkle here is that the getters are not reachable because IUserManager doesn't expose them, although I believe we do access those values directly in the demo app.
Given this is the case, would we still be ok with that approach?

Contributor

Ah right, these interfaces are exposed in IUserManager, so it would cause a breaking change. I think we should be OK with this, although it might not be super clear to a developer, IMHO.

Can we update the documentation so that it states:
Before hydration

  • if null then this is what it means

After hydration

  • if null then this is what it means

@sherwinski
Contributor Author

Manual Testing Steps

Setup

# Build and install the demo app (from OneSignalSDK/ directory)
./gradlew :app:installGmsDebug

# Useful commands for inspecting state
adb shell run-as com.onesignal.sdktest cat /data/data/com.onesignal.sdktest/shared_prefs/OneSignal.xml | rg address
adb shell run-as com.onesignal.sdktest cat /data/data/com.onesignal.sdktest/shared_prefs/OneSignal.xml | rg EMAIL
adb logcat -s "OneSignal" | grep -E "SDK-4202|addSubscription|RefreshUserOperationExecutor|Request Sent"

# Force-kill the app
adb shell am force-stop com.onesignal.sdktest

# Relaunch the app
adb shell am start -n com.onesignal.sdktest/.ui.main.MainActivity

Core Hashing

  • Email address is hashed in SharedPreferences
    Added test@example.com via demo app. Dumped SharedPreferences:

    "address":"973dfe463ec85785f5f95af5ba3906eedb2d931c24e69824a89ea65dba4e813b"
    

    64-char SHA-256 hex hash confirmed.

  • Push token is not hashed
    Same dump shows push token stored as raw FCM token:

    "type":"PUSH","address":"dz1A0qydQGCYM9dDgo6rB_:APA91bEq..."
    
  • Debug log shows hash, not raw email
    Logcat output for addSubscription shows the hashed address, not the plain-text email.

  • Network call sends raw email to backend
    Logcat shows the POST/PATCH request body contains "token":"test@example.com" (the raw value), not a hash.

  • Backend accepts the request (202)
    Server responded with STATUS: 202 for the subscription creation.


Removal — Post-Hydration (same session)

  • Remove after add (same session)
    Added an email, then removed it via the demo app UI. Confirmed the EMAIL entry disappeared from SharedPreferences:
    adb shell run-as com.onesignal.sdktest cat /data/data/com.onesignal.sdktest/shared_prefs/OneSignal.xml | rg EMAIL
    # (no output — EMAIL subscription removed)

Removal — Pre-Hydration (cold start, hash-aware comparison)

  • Remove before hydration completes
    Added another1@email.com, force-killed the app. Modified MainApplication.kt to call OneSignal.User.removeEmail("another1@email.com") immediately after initWithContext() (before RefreshUserOperationExecutor runs). Rebuilt and relaunched:
    ./gradlew :app:installGmsDebug
    adb shell am start -n com.onesignal.sdktest/.ui.main.MainActivity
    Dumped SharedPreferences — the EMAIL entry was successfully removed, confirming the hash-aware comparison in removeEmailSubscription() correctly matched the raw input against the hashed model.address.
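
The hash-aware comparison exercised by this test can be sketched as below. This is a simplified illustration, not the SDK code: SubscriptionSketch, sha256Hex, addressMatches, and removeEmail are hypothetical names, and the real removeEmailSubscription() operates on the SDK's model store rather than a plain list.

```kotlin
import java.security.MessageDigest

// Hypothetical, simplified stand-in for a stored subscription.
data class SubscriptionSketch(val type: String, val address: String)

// SHA-256 of the input as lowercase hex (assumed to mirror the PR's hashing).
fun sha256Hex(value: String): String =
    MessageDigest.getInstance("SHA-256")
        .digest(value.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }

// Matches whether the stored address is still raw (hydrated) or a hash (loaded from disk).
fun addressMatches(rawInput: String, storedAddress: String): Boolean =
    storedAddress == rawInput || storedAddress == sha256Hex(rawInput)

// Removes EMAIL subscriptions matching the raw email; returns true if anything was removed.
fun removeEmail(subscriptions: MutableList<SubscriptionSketch>, email: String): Boolean =
    subscriptions.removeAll { it.type == "EMAIL" && addressMatches(email, it.address) }

fun main() {
    // Pre-hydration state: the email address was loaded from disk as a hash.
    val subscriptions = mutableListOf(
        SubscriptionSketch("EMAIL", sha256Hex("another1@email.com")),
        SubscriptionSketch("PUSH", "example-push-token"),
    )
    println(removeEmail(subscriptions, "another1@email.com"))
    println(subscriptions.map { it.type })
}
```

Hashing the caller's raw input and comparing digests is what lets removal succeed without ever recovering the original value from disk.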

Cold Start + Hydration Cycle

  • Hash persists across cold start
    Added email, force-killed app, relaunched. Dumped SharedPreferences — address is still a 64-char hash (loaded from disk, not reverted to plain text).

  • Hash remains consistent after hydration re-persist
    After RefreshUserOperationExecutor hydrated the model with the raw value from the server, the next persist cycle re-hashed it. Dumped SharedPreferences — same 64-char hash as before (no double-hashing).

  • Server response contains raw email
    Logcat for the GET /users/by/onesignal_id/... response shows:

    "token":"getter@email.com"
    

    Confirming the server sends back the real email, which hydrates subscriptionModel.address.


Upgrade Migration (plain text → hashed)

  • Step 1 — Install base branch build (5.7-main), add email

    git checkout 5.7-main
    cd OneSignalSDK && ./gradlew :app:installGmsDebug

    Added upgrade@test.com via demo app. Dumped SharedPreferences:

    "address":"upgrade@test.com"
    

    Plain text confirmed on the base branch.

  • Step 2 — Install feature branch over the top (no uninstall)

    git checkout sherwin/sdk-4202-sensitive-user-data-stored-in-plain-text-onesignalxml
    cd OneSignalSDK && ./gradlew :app:installGmsDebug

    Launched the app. The first persist cycle hashed the existing plain-text address. Dumped SharedPreferences:

    "address":"9b2506faa92a13822f34bdd50c78820b04bb310bb7b4f41775b4ac361558c9bd"
    

    Migration from plain text to hash confirmed.

  • Step 3 — Removal works after migration
    Removed upgrade@test.com via demo app. Dumped SharedPreferences:

    adb shell run-as com.onesignal.sdktest cat /data/data/com.onesignal.sdktest/shared_prefs/OneSignal.xml | rg EMAIL
    # (no output — EMAIL subscription removed)

    Hash-aware removal works correctly on migrated data.


Not Tested / Known Limitations

  • Public getter pre-hydration window: IEmailSubscription.email returns "" when the underlying address is a hash. Could not be directly observed from the demo app because the public API (IUserManager) does not expose email/SMS subscription collections, and the hydration window is ~700ms (too fast for the demo UI to render).

@sherwinski sherwinski force-pushed the sherwin/sdk-4202-sensitive-user-data-stored-in-plain-text-onesignalxml branch 2 times, most recently from 325ab16 to ad33267 on April 16, 2026 21:56
@github-actions
Contributor

github-actions Bot commented Apr 16, 2026

📊 Diff Coverage Report

Diff Coverage Report (Changed Lines Only)

Gate: aggregate coverage on changed executable lines must be ≥ 80% (JaCoCo line data for lines touched in the diff).

Changed Files Coverage

  • PIIHasher.kt: 5/5 touched executable lines (100.0%) (25 touched lines in diff)
  • ModelStore.kt: 2/2 touched executable lines (100.0%) (11 touched lines in diff)
  • EmailSubscription.kt: 2/2 touched executable lines (100.0%) (5 touched lines in diff)
  • SmsSubscription.kt: 2/2 touched executable lines (100.0%) (5 touched lines in diff)
  • SubscriptionList.kt: 8/8 touched executable lines (100.0%) (16 touched lines in diff)
  • SubscriptionModelStore.kt: 6/6 touched executable lines (100.0%) (16 touched lines in diff)
  • SubscriptionManager.kt: 10/10 touched executable lines (100.0%) (13 touched lines in diff)

Overall (aggregate gate)

35/35 touched executable lines covered (100.0% — requires ≥ 80%)

📥 View workflow run

@sherwinski sherwinski force-pushed the sherwin/sdk-4202-sensitive-user-data-stored-in-plain-text-onesignalxml branch from a8482de to b79e1a1 on April 16, 2026 22:30
Standalone utility for hashing PII fields (email, phone number) before
persisting to SharedPreferences. Includes hash detection so already-hashed
values are not double-hashed.

Adds a protected open method that subclasses can override to transform
a model's JSON before it is written to SharedPreferences. The default
implementation returns the JSON unchanged, so this is a no-op for all
existing model stores.

Override transformJsonForPersistence in SubscriptionModelStore to SHA-256
hash the address field for EMAIL and SMS subscriptions before writing to
SharedPreferences. Push tokens are left unchanged. The in-memory model
always retains the raw value.

SubscriptionManager.removeEmail/removeSms and SubscriptionList.getByEmail/
getBySMS now compare against both the raw address and its SHA-256 hash so
lookups work whether the model is hydrated (raw) or loaded from disk
(hashed). Debug log in addSubscriptionToModels now logs the hash instead
of the raw address.

Between cold start and server hydration, model.address contains a
SHA-256 hash loaded from SharedPreferences. EmailSubscription.email and
SmsSubscription.number now return "" in this window instead of exposing
the opaque hash to app developers. Once RefreshUserOperationExecutor
hydrates the model, the real value is restored.

…rPersistence

The existing code accessed model.type directly, which throws a
NullPointerException when the model's type property hasn't been set
(e.g. in the ModelingTests deadlock test). Reading from the JSON
object via optString is null-safe and consistent with the rest of the
method.
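
To illustrate the persistence transform these commits describe, here is a rough sketch that reads the type from the serialized data rather than from the model. It uses a plain Map as a stand-in for org.json.JSONObject so that it runs without Android dependencies; transformForPersistence, hashAddress, and hasHashShape are hypothetical names, and the type strings "EMAIL", "SMS", and "PUSH" are assumed from the PR text.

```kotlin
import java.security.MessageDigest

// SHA-256 of the input as lowercase hex (assumed to mirror the PR's hashing).
fun hashAddress(value: String): String =
    MessageDigest.getInstance("SHA-256")
        .digest(value.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }

// Assumed detection rule: exactly 64 lowercase hex characters.
private val digestShape = Regex("[0-9a-f]{64}")
fun hasHashShape(value: String): Boolean = digestShape.matches(value)

// Returns a copy suitable for writing to disk; the input map is never mutated,
// which mirrors the in-memory model keeping its raw value.
fun transformForPersistence(json: Map<String, Any?>): Map<String, Any?> {
    // Read the type from the serialized data; a missing type means "leave unchanged".
    val type = json["type"] as? String ?: ""
    if (type.isEmpty() || type == "PUSH") return json

    val address = json["address"] as? String ?: return json
    // Idempotency guard: never hash a value that is already a hash.
    if (address.isEmpty() || hasHashShape(address)) return json

    return json + ("address" to hashAddress(address))
}

fun main() {
    val email = mapOf<String, Any?>("type" to "EMAIL", "address" to "upgrade@test.com")
    val push = mapOf<String, Any?>("type" to "PUSH", "address" to "example-push-token")

    val persisted = transformForPersistence(email)
    println(persisted["address"] != email["address"])         // email address replaced on disk
    println(transformForPersistence(persisted) == persisted)  // second pass changes nothing
    println(transformForPersistence(push) == push)            // push token untouched
}
```

The same path would also cover the upgrade case from the manual tests: a plain-text address written by an older build gets hashed the first time it passes through the transform.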
@sherwinski sherwinski force-pushed the sherwin/sdk-4202-sensitive-user-data-stored-in-plain-text-onesignalxml branch from b79e1a1 to 3dc4882 on April 16, 2026 22:50
- PIIHasherTests: hash output format, determinism, known digest,
  isHashed detection for valid/invalid inputs
- SubscriptionModelStoreTests: persist hashes email/SMS but not push,
  idempotent on already-hashed values, in-memory model stays raw
- SubscriptionManagerTests: hash-aware removal for email/SMS when
  model.address is hashed (pre-hydration), public getter returns
  empty string for hashed addresses, getByEmail/getBySMS find
  subscriptions with hashed addresses
@fadi-george
Contributor

I'll defer to AR's final review.

    json: JSONObject,
): JSONObject {
    val type = json.optString("type", "")
    if (type.isEmpty() || type == SubscriptionType.PUSH.toString()) return json

Contributor

What if we hash the PUSH token as well? Would that cause us any issues?

Contributor Author

Yes, I think that would cause issues. On cold start (before server hydration), model.address holds whatever was loaded from SharedPreferences. If the push token were hashed, any network operation during that window would send the hash instead of the real FCM/HMS token, which could break push delivery.

Take CreateSubscriptionOperation for example, which sends model.address in the POST/PATCH request body to the API. If the push token were hashed on disk and loaded back as a hash on cold start, we would be sending the hashed value to the backend.

Is there a reason to hash the push token? It's not as personally-identifiable as email or phone number.

Contributor

OK, makes sense. I was assuming that, just like email and phone number, these values are not really required, so a one-way hash would be applicable there as well. But if we are using the token to create the subscription, then yeah, we can't.

Yeah, I was pushing more on the consistency side.


@sherwinski sherwinski merged commit b36971a into 5.7-main Apr 21, 2026
3 checks passed
@sherwinski sherwinski deleted the sherwin/sdk-4202-sensitive-user-data-stored-in-plain-text-onesignalxml branch April 21, 2026 17:08

3 participants