-
Notifications
You must be signed in to change notification settings - Fork 377
feat: hash PII (email/SMS) in SharedPreferences at rest #2620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| package com.onesignal.common | ||
|
|
||
| import java.security.MessageDigest | ||
|
|
||
| /** | ||
| * Deterministic SHA-256 hashing for PII fields (email, phone number) so that | ||
| * sensitive data stored in SharedPreferences is not readable in plain text on | ||
| * rooted devices or via ADB backup. | ||
| * | ||
| * The hash is hex-encoded and always 64 characters long. | ||
| */ | ||
| object PIIHasher { | ||
| private const val SHA256_HEX_LENGTH = 64 | ||
| private val SHA256_HEX_REGEX = Regex("^[a-f0-9]{$SHA256_HEX_LENGTH}$") | ||
|
|
||
| /** Returns the lowercase hex-encoded SHA-256 hash of [value]. */ | ||
| fun hash(value: String): String { | ||
| val digest = MessageDigest.getInstance("SHA-256") | ||
| val bytes = digest.digest(value.toByteArray(Charsets.UTF_8)) | ||
| return bytes.joinToString("") { "%02x".format(it) } | ||
| } | ||
|
|
||
| /** Returns `true` if [value] looks like a 64-char lowercase hex SHA-256 digest. */ | ||
| fun isHashed(value: String): Boolean = SHA256_HEX_REGEX.matches(value) | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,11 +1,15 @@ | ||
| package com.onesignal.user.internal | ||
|
|
||
| import com.onesignal.common.PIIHasher | ||
| import com.onesignal.user.internal.subscriptions.SubscriptionModel | ||
| import com.onesignal.user.subscriptions.IEmailSubscription | ||
|
|
||
| internal class EmailSubscription( | ||
| model: SubscriptionModel, | ||
| ) : Subscription(model), IEmailSubscription { | ||
| override val email: String | ||
| get() = model.address | ||
| get() { | ||
| val address = model.address | ||
| return if (PIIHasher.isHashed(address)) "" else address | ||
| } | ||
|
Check failure on line 14 in OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/EmailSubscription.kt
|
||
|
Comment on lines
10
to
+14
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 After a cold restart, IEmailSubscription.email (and ISmsSubscription.number) returns "" instead of the user's address until server hydration completes, because addresses are now loaded from SharedPreferences in hashed form and EmailSubscription.email returns "" whenever PIIHasher.isHashed(address) is true. For offline-first apps or any scenario where network is unavailable at startup, hydration may never occur, leaving these properties permanently empty — and the PR description's claim of "No public API changes" is inaccurate. Callers should use subscriptions.emails.count() > 0 to check for subscription existence, and the IEmailSubscription.email / ISmsSubscription.number API docs should be updated to document the empty-string semantics. Extended reasoning...What the bug is and how it manifests This PR hashes email and SMS addresses before writing them to SharedPreferences (in SubscriptionModelStore.transformJsonForPersistence). After a cold restart, ModelStore.load() reads the persisted JSON back into memory — but since only the hash was stored, SubscriptionModel.address is initialized with the SHA-256 hex string. EmailSubscription.email (EmailSubscription.kt:13) then checks PIIHasher.isHashed(address) and returns "" if true. The result: the public property IEmailSubscription.email returns an empty string from process start until server hydration replaces the model with the plaintext address. The specific code path that triggers it
Why existing code does not prevent it The hashing is intentionally done at the serialization boundary so the in-memory model retains the plaintext value during the session. However, after a cold restart the in-memory model is rebuilt exclusively from the persisted (hashed) JSON — there is no mechanism to keep or recover the plaintext before server hydration. What the impact would be Any developer code that reads email.email or sms.number before hydration completes will receive an empty string: account-management UIs will show blank addresses; existence checks like if (email.email != "") will silently skip valid subscriptions; analytics or logging that captures the address will capture nothing. For offline-first apps — or apps whose network calls fail at startup — hydration never runs, leaving these properties permanently empty for the session. The PR checklist marks "No public API changes" as satisfied, but this IS a behavioral change to the IEmailSubscription and ISmsSubscription public contracts. Addressing the refutation The refuter correctly notes this is intentional design (SHA-256 is irreversible, so returning "" is the only reasonable choice), that the hydration window is typically short for online apps, and that checking email.email != "" for existence is a misuse pattern. All of this is true. However, the question is not whether the design is sound internally — it is whether the public API contract has changed without documentation. The IEmailSubscription.email KDoc says "The email address notifications will be sent to for this subscription", with no mention of returning "" for a hashed/pending state. The PR own unit test ("email getter returns empty string when address is hashed") proves the behavior is real and intentional, which makes the absence of public API documentation even more notable. Intentional design choices that change observable public API behavior still need to be documented. How to fix it The code logic itself may be acceptable as-is. The required fixes are: (1) update the IEmailSubscription.email and ISmsSubscription.number KDoc to state that the property returns "" when the address is pending server hydration after a cold restart; (2) document the recommended alternative for checking subscription existence: subscriptions.emails.count() > 0 or subscriptions.getByEmail(rawEmail); (3) update the PR checklist to accurately reflect that this is a public API behavioral change. Step-by-step proof
|
||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,11 +1,15 @@ | ||
| package com.onesignal.user.internal | ||
|
|
||
| import com.onesignal.common.PIIHasher | ||
| import com.onesignal.user.internal.subscriptions.SubscriptionModel | ||
| import com.onesignal.user.subscriptions.ISmsSubscription | ||
|
|
||
| internal class SmsSubscription( | ||
| model: SubscriptionModel, | ||
| ) : Subscription(model), ISmsSubscription { | ||
| override val number: String | ||
| get() = model.address | ||
| get() { | ||
| val address = model.address | ||
| return if (PIIHasher.isHashed(address)) "" else address | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,7 @@ | ||
| package com.onesignal.user.internal.subscriptions | ||
|
|
||
| import com.onesignal.common.PIIHasher | ||
| import com.onesignal.user.internal.Subscription | ||
| import com.onesignal.user.subscriptions.IEmailSubscription | ||
| import com.onesignal.user.subscriptions.IPushSubscription | ||
| import com.onesignal.user.subscriptions.ISmsSubscription | ||
|
|
@@ -30,15 +32,27 @@ class SubscriptionList(val collection: List<ISubscription>, private val _fallbac | |
|
|
||
| /** | ||
| * Retrieve the Email subscription with the matching email, if there is one. | ||
| * Compares against the underlying model address (raw or hashed) so lookups | ||
| * work both before and after server hydration. | ||
| */ | ||
| fun getByEmail(email: String): IEmailSubscription? { | ||
| return emails.firstOrNull { it.email == email } | ||
| val hashed = PIIHasher.hash(email) | ||
| return emails.firstOrNull { | ||
| val address = (it as Subscription).model.address | ||
| address == email || address == hashed | ||
| } | ||
|
Comment on lines
+39
to
+43
|
||
| } | ||
|
|
||
| /** | ||
| * Retrieve the SMS subscription with the matching SMS number, if there is one. | ||
| * Compares against the underlying model address (raw or hashed) so lookups | ||
| * work both before and after server hydration. | ||
| */ | ||
| fun getBySMS(sms: String): ISmsSubscription? { | ||
| return smss.firstOrNull { it.number == sms } | ||
| val hashed = PIIHasher.hash(sms) | ||
| return smss.firstOrNull { | ||
| val address = (it as Subscription).model.address | ||
| address == sms || address == hashed | ||
| } | ||
|
Comment on lines
+52
to
+56
|
||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| package com.onesignal.common | ||
|
|
||
| import io.kotest.core.spec.style.FunSpec | ||
| import io.kotest.matchers.shouldBe | ||
| import io.kotest.matchers.string.shouldHaveLength | ||
| import io.kotest.matchers.string.shouldMatch | ||
|
|
||
| class PIIHasherTests : FunSpec({ | ||
|
|
||
| test("hash produces 64-char lowercase hex string") { | ||
| val result = PIIHasher.hash("test@example.com") | ||
| result shouldHaveLength 64 | ||
| result shouldMatch Regex("^[a-f0-9]{64}$") | ||
| } | ||
|
|
||
| test("hash is deterministic") { | ||
| PIIHasher.hash("test@example.com") shouldBe PIIHasher.hash("test@example.com") | ||
| } | ||
|
|
||
| test("hash produces different output for different input") { | ||
| val hash1 = PIIHasher.hash("user1@example.com") | ||
| val hash2 = PIIHasher.hash("user2@example.com") | ||
| (hash1 != hash2) shouldBe true | ||
| } | ||
|
|
||
| test("hash matches known SHA-256 digest") { | ||
| // SHA-256 of "hello" is well-known | ||
| PIIHasher.hash("hello") shouldBe "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824" | ||
| } | ||
|
|
||
| test("isHashed returns true for valid 64-char hex string") { | ||
| val hashed = PIIHasher.hash("test@example.com") | ||
| PIIHasher.isHashed(hashed) shouldBe true | ||
| } | ||
|
|
||
| test("isHashed returns false for plain email") { | ||
| PIIHasher.isHashed("test@example.com") shouldBe false | ||
| } | ||
|
|
||
| test("isHashed returns false for phone number") { | ||
| PIIHasher.isHashed("+15558675309") shouldBe false | ||
| } | ||
|
|
||
| test("isHashed returns false for empty string") { | ||
| PIIHasher.isHashed("") shouldBe false | ||
| } | ||
|
|
||
| test("isHashed returns false for uppercase hex") { | ||
| val upper = PIIHasher.hash("test").uppercase() | ||
| PIIHasher.isHashed(upper) shouldBe false | ||
| } | ||
|
|
||
| test("isHashed returns false for 63-char hex string") { | ||
| PIIHasher.isHashed("a".repeat(63)) shouldBe false | ||
| } | ||
|
|
||
| test("isHashed returns false for 65-char hex string") { | ||
| PIIHasher.isHashed("a".repeat(65)) shouldBe false | ||
| } | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔴 This PR hashes email/SMS addresses in
SubscriptionModelStorebut leavesOperationModelStoreunaddressed: pendingCreateSubscriptionOperationandUpdateSubscriptionOperationobjects contain a rawaddressfield populated frommodel.address(plaintext email/phone), and sinceOperationModelStoredoes not overridetransformJsonForPersistence, these are serialized to SharedPreferences underMODEL_STORE_PREFIX + "operations"with no hashing. On a rooted device or via ADB backup the plaintext PII is still fully exposed, leaving the stated security goal only half-achieved. Note: fixing this is non-trivial because operations must retain the raw address to execute the API call (SHA-256 is irreversible).Extended reasoning...
What the bug is and how it manifests
The PR introduces a
transformJsonForPersistencehook inModelStoreand overrides it inSubscriptionModelStoreto SHA-256-hash email and SMS addresses before writing to SharedPreferences. However, a second persistence path exists:OperationModelStore, which queues pending API operations such asCreateSubscriptionOperationandUpdateSubscriptionOperation.OperationModelStoreextendsModelStore<Operation>but does not overridetransformJsonForPersistence, so the base no-op implementation is used and the raw address field in those operation objects is written to SharedPreferences unchanged.The specific code path that triggers it
SubscriptionModelStoreListener.getAddOperation()(line 29) andgetUpdateOperation()(line 52) each passmodel.address— the raw, unhashed email or phone number — toCreateSubscriptionOperation/UpdateSubscriptionOperation.OperationModelStore(line 32 ofOperationModelStore.kt), whosepersist()call iterates all pending operations and callstransformJsonForPersistence(model, model.toJSON()). Because there is no override, the JSON is returned unchanged.addressfield — is saved to SharedPreferences underMODEL_STORE_PREFIX + "operations".Why existing code does not prevent it
The PR's own test
"persist keeps in-memory model address as raw value"explicitly confirms that in-memorySubscriptionModel.addressstays raw afteradd(). The hashing is a JSON-output-only transformation applied only bySubscriptionModelStore.transformJsonForPersistence.OperationModelStorehas no such override, so it serialises whateverOperation.toJSON()produces — which includes the rawaddressstring.Impact
Any device that is rooted, has ADB backup enabled, or has been analysed with a forensic tool would expose plaintext email addresses and phone numbers stored in the
"operations"SharedPreferences key. This directly contradicts the PR's stated goal of keeping PII out of SharedPreferences in plain text. The window of exposure is any time an operation is pending — between when the user adds/updates a subscription and when the network call completes and the operation is dequeued.Concrete step-by-step proof
OneSignal.User.addEmail("victim@example.com").SubscriptionManager.addSubscriptionToModels(EMAIL, "victim@example.com")creates aSubscriptionModelwithaddress = "victim@example.com"and calls_subscriptionModelStore.add(model).SubscriptionModelStoreListener.onModelAddedfires, callinggetAddOperation(model)(line 29 ofSubscriptionModelStoreListener.kt) which passesmodel.address(="victim@example.com") intoCreateSubscriptionOperation.OperationModelStore;persist()runs using the no-op basetransformJsonForPersistence.{..."address":"victim@example.com"...}underMODEL_STORE_PREFIX + "operations".adb backupor a/data/data/read on a rooted device exposes the plaintext email.How to fix it
Fixing this is non-trivial because
SubscriptionOperationExecutorreads theaddressfield from the persisted JSON to construct the API request body (the value must be the actual email/phone number, not its hash). A possible approach: store a hashed surrogate in the persisted operation JSON but keep the raw address in a separate in-memory-only field, rehydrating it from the liveSubscriptionModelwhen the operation is executed — or at minimum document this as a known gap.