feat: hash PII (email/SMS) in SharedPreferences at rest#2614
Conversation
5c2841e to
322b181
Compare
| get() = model.address | ||
| get() { | ||
| val address = model.address | ||
| return if (PIIHasher.isHashed(address)) "" else address |
There was a problem hiding this comment.
why return empty, would that affect getBySMS check?
There was a problem hiding this comment.
getBySMS reads model.address directly (not the public getter), so there's no impact. The empty return is to protect against leaking a hash to app developers during the cold-start-to-hydration window. I opted for this approach to signal to devs that the value isn't available yet as opposed to providing a confusing value.
There was a problem hiding this comment.
instead of this can we pass an Enum back.
Something like
Hash("value")
Address("value")
Unknown
then its more clear and explicit
There was a problem hiding this comment.
I agree with you, but I'd want to clarify one thing first.
ISmsSubscription and IEmailSubscription expose public getters for email and sms, so passing an enum back would technically be a breaking change. The wrinkle here is that the getters are not reachable because IUserManager doesn't expose them, although I believe we do access those values directly in the demo app.
Given this is the case, would we still be ok with that approach?
There was a problem hiding this comment.
ah right, these Interface are exposed in the IUserManager - would cause a breaking change. I think we should be with this although its might not super clear IMHO for a developer.
Can we update documentation that states -
Before hydration
- if null then this is what it means
After hydration
- if null then this is what it means
Manual Testing StepsSetup# Build and install the demo app (from OneSignalSDK/ directory)
./gradlew :app:installGmsDebug
# Useful commands for inspecting state
adb shell run-as com.onesignal.sdktest cat /data/data/com.onesignal.sdktest/shared_prefs/OneSignal.xml | rg address
adb shell run-as com.onesignal.sdktest cat /data/data/com.onesignal.sdktest/shared_prefs/OneSignal.xml | rg EMAIL
adb logcat -s "OneSignal" | grep -E "SDK-4202|addSubscription|RefreshUserOperationExecutor|Request Sent"
# Force-kill the app
adb shell am force-stop com.onesignal.sdktest
# Relaunch the app
adb shell am start -n com.onesignal.sdktest/.ui.main.MainActivityCore Hashing
Removal — Post-Hydration (same session)
Removal — Pre-Hydration (cold start, hash-aware comparison)
Cold Start + Hydration Cycle
Upgrade Migration (plain text → hashed)
Not Tested / Known Limitations
|
325ab16 to
ad33267
Compare
📊 Diff Coverage ReportDiff Coverage Report (Changed Lines Only)Gate: aggregate coverage on changed executable lines must be ≥ 80% (JaCoCo line data for lines touched in the diff). Changed Files Coverage
Overall (aggregate gate)35/35 touched executable lines covered (100.0% — requires ≥ 80%) |
a8482de to
b79e1a1
Compare
Standalone utility for hashing PII fields (email, phone number) before persisting to SharedPreferences. Includes hash detection so already-hashed values are not double-hashed.
Adds a protected open method that subclasses can override to transform a model's JSON before it is written to SharedPreferences. The default implementation returns the JSON unchanged, so this is a no-op for all existing model stores.
Override transformJsonForPersistence in SubscriptionModelStore to SHA-256 hash the address field for EMAIL and SMS subscriptions before writing to SharedPreferences. Push tokens are left unchanged. The in-memory model always retains the raw value.
SubscriptionManager.removeEmail/removeSms and SubscriptionList.getByEmail/ getBySMS now compare against both the raw address and its SHA-256 hash so lookups work whether the model is hydrated (raw) or loaded from disk (hashed). Debug log in addSubscriptionToModels now logs the hash instead of the raw address.
Between cold start and server hydration, model.address contains a SHA-256 hash loaded from SharedPreferences. EmailSubscription.email and SmsSubscription.number now return "" in this window instead of exposing the opaque hash to app developers. Once RefreshUserOperationExecutor hydrates the model, the real value is restored.
…rPersistence The existing code accessed model.type directly, which throws a NullPointerException when the model's type property hasn't been set (e.g. in the ModelingTests deadlock test). Reading from the JSON object via optString is null-safe and consistent with the rest of the method.
b79e1a1 to
3dc4882
Compare
- PIIHasherTests: hash output format, determinism, known digest, isHashed detection for valid/invalid inputs - SubscriptionModelStoreTests: persist hashes email/SMS but not push, idempotent on already-hashed values, in-memory model stays raw - SubscriptionManagerTests: hash-aware removal for email/SMS when model.address is hashed (pre-hydration), public getter returns empty string for hashed addresses, getByEmail/getBySMS find subscriptions with hashed addresses
|
Ill defer for ARs final review |
| json: JSONObject, | ||
| ): JSONObject { | ||
| val type = json.optString("type", "") | ||
| if (type.isEmpty() || type == SubscriptionType.PUSH.toString()) return json |
There was a problem hiding this comment.
what if we hash the PUSH token as well? Would that cause us any issues?
There was a problem hiding this comment.
Yes, I think that would cause issues. On cold start (before server hydration), model.address holds whatever was loaded from SharedPreferences. If the push token were hashed, any network operation during that window would send the hash instead of the real FCM/HMS token, which could break push delivery.
Take CreateSubscriptionOperation for example, which sends model.address in the POST/PATCH request body to the API. If the push token were hashed on disk and loaded back as a hash on cold start, we would be sending the hashed value to the backend.
Is there a reason to hash the push token? It's not as personally-identifiable as email or phone number.
There was a problem hiding this comment.
Ok makes sense. I was assuming that just like email and phone number these values are not really required and a one way hash would be applicable there as well. But we are using the token to Create the Subscription then yeah, we cant.
Yeah I was trying to push more so on the consistency side.
| get() = model.address | ||
| get() { | ||
| val address = model.address | ||
| return if (PIIHasher.isHashed(address)) "" else address |
There was a problem hiding this comment.
instead of this can we pass an Enum back.
Something like
Hash("value")
Address("value")
Unknown
then its more clear and explicit
| json: JSONObject, | ||
| ): JSONObject { | ||
| val type = json.optString("type", "") | ||
| if (type.isEmpty() || type == SubscriptionType.PUSH.toString()) return json |
There was a problem hiding this comment.
Ok makes sense. I was assuming that just like email and phone number these values are not really required and a one way hash would be applicable there as well. But we are using the token to Create the Subscription then yeah, we cant.
Yeah I was trying to push more so on the consistency side.
| get() = model.address | ||
| get() { | ||
| val address = model.address | ||
| return if (PIIHasher.isHashed(address)) "" else address |
There was a problem hiding this comment.
ah right, these Interface are exposed in the IUserManager - would cause a breaking change. I think we should be with this although its might not super clear IMHO for a developer.
Can we update documentation that states -
Before hydration
- if null then this is what it means
After hydration
- if null then this is what it means
Description
One Line Summary
Hash email and SMS subscription addresses (SHA-256) before writing to SharedPreferences so PII is not stored in plain text on device.
Details
Motivation
Email addresses and phone numbers are stored in plain text in
OneSignal.xmlunder theMODEL_STORE_subscriptionskey. On rooted devices or via ADB backup, this PII is directly readable. This PR hashes those values at the serialization boundary so the on-disk representation is opaque.Scope
SubscriptionModel.addressalways holds the raw value at runtime (after server hydration). Network requests continue to send the real email/phone to the backend API.IEmailSubscription.emailandISmsSubscription.numbergetters return the raw value post-hydration, or an empty string during the brief cold-start window beforeRefreshUserOperationExecutorrestores the real value.How it works
PIIHasher— new utility: deterministic SHA-256 hashing andisHashed()detection (64-char lowercase hex).ModelStore.transformJsonForPersistence()— new protected hook called duringpersist(). Default is identity; subclasses can override to transform JSON before it hits SharedPreferences.SubscriptionModelStore— overrides the hook to hash theaddressfield for EMAIL and SMS subscriptions. Skips PUSH. Skips already-hashed values (idempotency guard against double-hashing on restart).SubscriptionManager.removeEmailSubscription(),removeSmsSubscription(),SubscriptionList.getByEmail(), andgetBySMS()now compare the raw input against both the raw and hashedmodel.address, so lookups work whether the model is pre- or post-hydration.SubscriptionManager.addSubscriptionToModels()debug log now prints the hash instead of the raw address.Trade-off
Between cold start and server hydration (~0.5–1s),
model.addresscontains a hash loaded from SharedPreferences. During this window, public getters return an empty string. OnceRefreshUserOperationExecutorhydrates the model from the server response, the real value is restored and persisted as a new hash.Note: The
IEmailSubscription.emailandISmsSubscription.numbergetters exist but are not reachable through the current public API (IUserManagerdoes not expose email/SMS subscription collections). If a future API exposes them, they will return an empty string during the brief cold-start to hydration window.Testing
Unit testing
All 560 existing unit tests pass. The
ModelingTestsdeadlock test exercised the newtransformJsonForPersistencepath — an initial NPE (accessingmodel.typeon an uninitialized model) was caught and fixed by reading the type from the JSON object instead.New tests were added for the following:
isHashed detection for valid/invalid inputs
idempotent on already-hashed values, in-memory model stays raw
model.address is hashed (pre-hydration), public getter returns
empty string for hashed addresses, getByEmail/getBySMS find
subscriptions with hashed addresses
Manual testing
Tested on a Pixel emulator (API 35) with the OneSignal demo app:
removeEmail()call immediately afterinitWithContext(beforeRefreshUserOperationExecutorruns). The hash-aware comparison matched the raw input against the hashed stored value → subscription removed correctly.5.7-mainbuild, added email (stored as plain text). Installed feature branch build over the top (no uninstall). On first launch, the plain-text address was automatically hashed during the first persist cycle.removeEmail()with the raw email successfully removed the now-hashed subscription.Affected code checklist
Checklist
Overview
Testing
Final pass