From 1fed637e692616ccf0070d21fd2024de85dd8394 Mon Sep 17 00:00:00 2001
From: Ivan Schasny
Date: Fri, 9 Dec 2022 13:41:51 +0000
Subject: [PATCH 01/49] Add reader's privacy specification

---
 reader-privacy.md               | 120 ++++++++++++++++++++++++++++++++
 resources/readers-privacy-1.png | Bin 0 -> 38683 bytes
 2 files changed, 120 insertions(+)
 create mode 100644 reader-privacy.md
 create mode 100644 resources/readers-privacy-1.png

diff --git a/reader-privacy.md b/reader-privacy.md
new file mode 100644
index 0000000..fcc6bde
--- /dev/null
+++ b/reader-privacy.md
@@ -0,0 +1,120 @@
+# Reader Privacy Preservation
+​
+![wip](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square)
+​
+**Author(s)**:
+​
+
+​
+- [Andrew Gillis](https://github.com/gammazero)
+- [Ivan Schasny](https://github.com/ischasny)
+- [Masih Derkani](https://github.com/masih)
+- [Will Scott](https://github.com/willscott)
+​
+**Maintainer(s)**:
+​
+- [Andrew Gillis](https://github.com/gammazero)
+- [Ivan Schasny](https://github.com/ischasny)
+- [Masih Derkani](https://github.com/masih)
+- [Will Scott](https://github.com/willscott)
+​
+* * *
+​
+**Abstract**
+​
+The lookup APIs provided by IPNI nodes are able to observe what data is being accessed by the clients.
+This is true regardless of whether the data itself is public or not. Because IPNI nodes continuously
+catalogue the content hosted by all the providers, and provide a central lookup API the need for
+reader privacy is amplified. This makes IPNI a difficult choice as an alternative routing system in
+projects such as IPFS, which use a more decentralised routing system that by nature reduces the
+possibility of mass query snooping.
+​
+There is ongoing work on the IPFS side to integrate a reader privacy technique, a.k.a. double hashing.
+Building on top of the existing approach, this document specifies how a similar technique is applied
+to IPNI in order to preserve the reader's privacy while continuing to facilitate low-latency
+provider lookup.
+​
+## Table of Contents
+​
+- [Introduction](#introduction)
+- [Background](#background)
+- [Specification](#specification)
+  - [Security](#security)
+- [Related Resources](#related-resources)
+​
+## Introduction
+​
+IPFS currently lacks many privacy protections. One of its main weak points lies in the lack
+of privacy protections for the content routing subsystem. Currently neither readers (clients accessing files)
+nor writers (hosts storing and distributing content) have much privacy with regard to content they publish or
+consume. It is very easy for a content router node or a passive observer to learn which file is requested by
+which client during the routing process, as the potential adversary easily learns about the requested `CID`.
+A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior.
+This is obviously undesirable, and fixing it has long been a strong request from the community.
+
+The changes described in this specification introduce an IPNI Reader Privacy upgrade. It will prevent
+passive observers from tracking users' actions as described above. It will also be a first step towards
+a fully private IPNI protocol that will eliminate indexers as centralised observers.
+
+### Non Goals
+
+* Writer's (publisher's) Privacy, which will be addressed in a separate specification;
+* Client-to-provider privacy, which is out of scope for the content routing subsystem.
+​
+## Background
+​
+Network indexers build their indexes by ingesting chains of Advertisements. An Advertisement is a
+construct that allows Storage Providers to publish their CIDs in bulk (FIL deals) instead of doing
+that individually for each CID. A group of CIDs is represented by a unique ContextID, as can be seen
+in the diagram below:
+​
+![Index building flow](resources/readers-privacy-1.png)
+
+## Specification
+​
+This specification focuses on improving **step #3**, where a client has to pass a CID to the indexer *in the open*
+to get a list of providers from which the content can be fetched.
+
+To protect the reader's privacy, the proposal is to change how CID lookup works as follows:
+
+* A client who wants to do a lookup will calculate a hash over the CID (`hash(CID)`) and use it for the
+lookup query (hence the name double hashing, as a CID already contains a hash of the content);
+* In response to the hashed find request, the indexer will return a set of encrypted `ProviderRecordKey`s.
+A `ProviderRecordKey` will consist of two concatenated hashes: one over `peerID` and the other over `contextID`.
+Each `ProviderRecordKey` will be encrypted with a key derived from the *original* CID value:
+`enc(hash(peerID) || hash(contextID), CID)`, where `hash` is a hash over the value, `||` is concatenation
+and `enc` is encryption over the value. In order to make sense of that payload, a passive observer would need
+to get hold of the original CID, which isn't revealed during the communication round;
+* Using the original CID, the client would decrypt the `ProviderRecordKey`s and then calculate another hash
+over the decrypted `hash(peerID)` part of each. Using that hash, the client would do another lookup for each `ProviderRecordKey`
+to get an encrypted `ProviderRecord` in response. A `ProviderRecord` will contain information about the provider,
+such as its *peerID*, *multiaddresses*, *supported protocols*, etc. Each `ProviderRecord` will be encrypted
+with a key derived from `hash(peerID)`. In order to make sense of that payload, a passive observer would need to
+get hold of the decrypted `ProviderRecordKey`, which isn't revealed during the communication round;
+* Using the `hash(peerID)` from the `ProviderRecordKey`s, the client would decrypt the `ProviderRecord`s and then reach out to the
+provider directly to fetch the desired content.
+
+With such a scheme, only a party that knows the original CID being looked up can decode the protocol;
+however, that CID is never revealed.
+
+### Security
+​
+The security model of the Reader's Privacy proposal boils down to the inability to *algorithmically* derive the original CID value from a
+`hash(CID)` that is used for IPNI lookups. Right now indexer advertisements are authenticated but not encrypted, and contain plain CID values.
+That is going to change once *Writer's Privacy* is implemented. Until then, a sophisticated attacker could build a map of `hash(CID) -> CID`
+by re-ingesting the advertisement chain from each publisher and use it to decrypt the protocol.
+Doing so would require significant investment in infrastructure, and the possibility will be eliminated by the *Writer's Privacy* upgrade.
+
+Reader's Privacy is a first step towards a fully private content routing protocol.
+​
+## Related Resources
+​
+TODO: link to corresponding IPFS spec once materialised.
+​
+* [Double Hashing and Content Routing](https://youtu.be/ZPIDU1-JnVc)
+* [Double Hashing as a way to increase reader privacy](https://youtu.be/VBlx-VvIZqU)
+* [Deployment and transition options of Double Hashing](https://youtu.be/m-6_VZ8e1tk)
+​
+## Copyright
+​
+Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
\ No newline at end of file diff --git a/resources/readers-privacy-1.png b/resources/readers-privacy-1.png new file mode 100644 index 0000000000000000000000000000000000000000..cd5488daec21ed33a7ef2bf4389c2da4e9b86ced GIT binary patch literal 38683 zcmdqHWl&sC@HV=H07J$|SUH@- zF4{V}L&Bq#G>vBF7AmUi8=Bg7p?ki8q1r}P*?EOx(kh}-O0kJ)dM4KLs(N-V-k(); zt81GQQZkj)3_^S&TMAlS+X`DZ zA=~?XklU#<6C?Y9^1l4Ip9!I<)(#$J6}3re**<}x=GM--##UvOwX#au@@jh1v-6G3 z?fE}Tg2TQChD7)WhUNY!j*3h64-R*5^*uX>X~#;SwhKVBRFabce8-@aM>Sr$%NuwB z0HnG9{m>MA!qZTlSl$XMvRM1rl-QK71)*PVP&NZ7$Vh4Xuby?i>|PgY*jxg)<(a7+Y!m`#VV%yJ0kHV8+b0iV>~Y7>17>&lmq3vO={dxDH>2)+z6+NITJ(Wd;* zNB;*ttuIS_1k6%$cLrUQw7O2}7UFr%$eV-KbOC@`wn`pXaB;g-8j(|o6}*E&8Z-s# zC!Oo824y(T4tDlMJOg}{O%8}Y-H&}`Wwt*4?$I)P&7r*^)@!$ZNvF8|Q*#GQ7ByyX zTiIsZm|HRxgSm!>l7Ea=|sBJLGO8|ga&P8NTnbT~w(7wF=!)IL8 zZ(70!OAFF75g3yWqT13Hab8iYxY`5L{5D*Un|dnrjC=>$=P^>0^@SdAG4+5IB1V{U zht$d_F4@^Hp2p83;v&wg2VC{kC zE6{tf)g$WXIN6g;W->pr&k~)32A&%Y3Xr2@P*uu{KS}6DIku84lwp zv~+Xk#R{U19~*4M@N|@io&x~6p?ktvF*mY!IYE6jl|XJ%7tXY33q)s{7=QO-2i`puEzqIk(=WjOUSqR zxnAMTXx2)J>yW2Wzg2}OB1+lC)VzBM;EpYwRYs2P$>z8GJ1b3epayCIVSBi^&)HasSG($pM#^g9+oKpSSu z^c(lMlnXa8}JvVJmKz88zBT|dr{wfzm0dDv{0DwP(&!N6x@3&sw-)(E_ zGbF@I=Ar`5u`dYL8$OPj@k;7-?&(h>ex=Xr#!jaMd{(P)SynjIQsOrH;7pL6po)?E z>ZCbdz@xk+Eg_Hu^89nh_tx^ani)9KY#Xeo4Lm1M=EFN(u&9p31I6V?*c2y35MpXd zXDE`bH1@7bJv)DU^o8Z{&OSIf;X@ESfHI+?X?cB@o<67J-cq2SMJMjV_Xp2JGi^g? 
zuz(!%s&}gdK|nqg;eV3JF`8)EWz^?sCeS;=zMloapCxLN$#F?1>h3ut=Dh4HEP2Zc zlw1Kdg#YPRRr#Fdd@Nz7Q{N)wDLhQFqNf(cMjpo$lu1#LgIAyJ{1ljPNm)*0Xr zFNLy&Mf69y1_(QN&Nkr7AJN~wm$ULrV#cohyO9+{ zF->yZSSLa5dr#Im>VW^F6qlHgT^?&Tcqx<#$l(OMmkpC5V>J zmw}%+4uar3R(%-olL|C|IYwhN5v>yEt89CBQL^;Z*WZ?sjI*VBwb*4rI>uR)^Nq$w zv61${?+%*sbHx)1@D&9}_XGA)>Gia2A{ws%Uwu+_-iEGum;25~S+X+tZj- z4&jJDW`Rn78U|REOP&ZZGm+;X(I#VfH03>VSQjEhPvtCJ+mxym4PlagT}9Obdlq9k zWmlC$3}b0ARg9rrZJn=l)Z_{xia&WbVvhI$08UJT8@~s80TZ()`pWpTL`(D3zP*2w zUf9uEMOOZxMH9Cx)n-zloBE71O1z%KTIkLr{y}?~e#)qsZ&z79r;ltUu&e>UEj`Mi zHRT7s^O5f?^8p65T7RJ5aW2_Ts=1>q79XWWCOO#pH0pTOa16dd#-kBsoI-h2upq;O z`xaB&_N1BIt78ZcI6Y&Y?W!QrA(rc+U}8}-_CZhi{#YwB9Ua%_4t=M*>d?hPk;_;G zZlX#3!^_Oq`HQf2(SZ$PB4{O9j+^M{8wxT4N64+(WB9djnSK^Ac92XkOw8FbjQWie zO+CeI#ya$4x^O)Dj^w?Yi?{vH*VA07l-kK~d&~@ZYbDMZomukW=YoOxg`MaaS z2C6wI`k}4u+A1TU0I8o%rwc4D=XYOmGKpFWHEsU=Y5P!+DxPasN@=Np!fZK1ZR+RI zaJa#clf-8(G{8yeS?A$rG%khu@OawatP^!3YIdxH)w~9^(rU()n<}p@JUL#{XU-H= zz0y2)Y;0sXFG|ev7Eq%tIfnZD6!#&`{>;@U)uNro!Sqikvd8{=;@ImK0x8PLac-V`@RA4*J3bd5+x|S|Ln|F(q5~f)`wtGs=rOzOp8=U0E8*3MB zgsnqJ0nVMTi$FkB-_*g)k0fOJ2s`<$mE0>GmQW-@i$`=hi&Ul{ckJ zl^+WC;;mwN8So+N)78nJ!+)C5C2N@Ue}z20-!ko%9VL&YjIQ%o5(5BYXf@#j>K~kA zT$K;rY2kv%w9D6|M7-W{ITIX6Le#ibc112>=0*)3bWO5WuZHHVnZ_Dn#ezf=9eG9Q zF@#DE4+$k10oMdi@7=+FGCJGF-b0sXHQts5S-g?s(($iP+mRLum@Jl4olDL;dKYK> z44|tDYZp=jg94(Fg+8uXN4^x%&SlibD$Pf${QRL^><0YlXK@v6b0;Mkwjh#lMb0_! zPC}mX^;;D-?>_-U=nDjopPuJfL$uEdJyqj3_p@(488Z=XhL;${<41f3zUY&84Dg1! 
zNNeqHP}19?0eVu0K>?!NxCax@bfa6%Vk}A~2Yzba5k;u^`ul7vF(a>*P&`@SnQ_fYwrhR%4Pxy# zv!H@!@wuu}tp(1D{pS;>~(CqLTsfdyJ#;7G$GBgunrHT)sN{=&`d}XM);ogbb zN>-9Ouk~Aj4fQqE*_?=l9Nu+r6CKanyhXBuAc$#KRD1R8d|KO4WwQb?-{jYI1a4Wq znZr9Hce>^}2Al5fQn`brjBsIlu%5ZED<~JQUX&cy@aRbGr3>ZC3n-4p>?Rg!y2>az1k*y^bY@8Lcw-LP!|`_))$wW`5x~&N3c( zA`=$wkP|d7zSgPW7F=Vb%%CtLoq<2u_H#}qds!olS4l--#88vm%XWge%MAc%C1c)= zXtI85OLpVAs$8+cKOrKPG>houVt7UswB48-XT#)s<*I2#NrgkHGxY11cKs{jG|lER zi+fDK*W9E2%VIi*1W>%ACt4w1-Mfihox=p`nD^Ggag1HJnwCguK9wm2k-6T4KUzY; z3Zfa%A!&LOFHv!9l<7Oa+E*UiE6-pceShJK2UF7R-7@Tb|1P)e zqDyJ{q@T(vGv@9e&7+n#yPDc3-f+Q~Oj)8ogO z^KKjUB=@v6~Ro+Dxp&JxTPy0hU;){Mv&?);r>(;FnFo0)-=8o@M%D$0B%*H~N;XlNr< zxEsY)U`Wl|Nz#A$xyWT|fD>PAF`V5FO|^bqrwf*8w8o^<5eoLWiPG1ywHm1ev9Qp- zRg4`+)fq0BM)?;}EBM+pZ9YqnI+3r_2Y;XrQ!Oz+llU`0&pVBG`6#FUZ}jx^4L2@k zcqrRdYq^(&w)}2#{So|tS}?O=%4u&KT{<*|v2AHdp~STL++V9ZKk(K8g;4?{-frv% zIV>w+?8mtcJ4nD(6{l5AmwU;?3?mlcYX?pJhPzx_qp7LSE!X`v7}-caEQE?D+&$63 zJ-&JGf1QUn4csUiTZzHFj-Jg|u>O*mj5$})!O437g*NX@dCDMbZ5BLTo62f*|0Hb5 z0VsZS#pwQFE5384`Wxl$hQpTy`RMxv$ z3v2=aPO~bx<2~+4x5kD=8gGW)0?bL=6C2mH615j}>|!iP)Yt(%x@A$3m~20SNfYI+ zqNRJ$P}JzA1orK`&)PF#@e``eUR)i=DtK(|$;Wxqdp61a3}CL3bm^zJe?FEJMnXFB z98ejWfqCv}kB}Gj9=w^7K*eqAE0J8gGWI##RyM%bk|NKnKyIleOcRtb=w`rC2E8*o zyzl;N!G~MDUWOxlTCu^>-2Q+b%QB^Fb~l+!P;V<2!2FFhvF^rFk`85o58&&;oGxh< zkkeY4^^E~B;6k~P4_9QMi|j=Ht`Jo1>h?d;1VoY#fR662>`%w9vC;# z&)&bn0r>0H6I`)FqQE@F;V%JS{XXG>Lep=%hCA@oJ_CA!esWyznOv&LW%Zx|5T$M+ zog3YJE@;@V09%uX&#ashX<+ZZp<@Bo@!!>b^OtOk@l2^#{F|yLAu&7(L;D39wZKH3@ZK znEIpx`tyqra6xR3H`0)3u{7@=_zbWa_ThCRb1D%geHP%WURX7PuJ-*;0Kil(5piqt z(*t$(Yd)si`rq;ZXxX)+_m!>y0O?p!856;`aj!4d7OKW`1kR z>ont*>}JI#t-hq4{?E+nRR#>ayXVc@l+{G(^7EB}@}Q`hG@n5yGL?I(A{Tw?1rug9 zSLVdBaDnfxAjj|4Au4-vGa(+aCKHacJ)`47W=yi$$BamfY?1THZ!zi4TElO!4k<)7 z!%tsIXKsyDj&)fiF8lm_E{haUttcRx3FOo>7~nmKn0=b8Pl@oLDLYfi^vEu-G)`f- zo7$KjGP6o@)3VVv`8>tL$Fkw>lU;ahDA!(aIa&W1nkjM{U$*Q4{yqc}-8Hl+Q+(o@ z*5AYbrmK6seDPzfj|KAYLID(HsK7~o&}aQ1%-;n&zM5GUC9?qdi;CyPlBlUUHu@I< 
zwd7XDwmz~s?lsU7Isf%roNkaueu!a&PmeT|l}=PDa4t!K4H(+AnVs5lO(Kqe6FgD; zoN|F+;u4EX{x(({(jClqI#P8CNm$k#J8({OYqoFrDacEgcC!2tKE1(p81fO^J6A{A zXpB}d4P$|z2a_Et)r#tCb}2qjyX*Mpc4CuFa=6uUP_3n1C39|CW&~4TM`yzH}*=U_*`}21asO_Yc9V(B=|+Pe->Wq zEvEZmMB+!ax0Z15Dbiz?00V#Zd`Te5?ZR0*qE?>W0^^QyPjtcPU!^RVoEfLF(su2G zQjq@XYUZBk=}>JvQ<aig{BOL(rqEQ0mhRAt2(NH#(;Zg%ww9;_bxW=JS|dGCP! zD=@sY9wbcLVTZg6ri0Ac5w0tLuAlsFv*P3#q%CdD1zGsZ|GT*T*Vs%(WkYH@!`Otj ziJ*f!8)Tcdh3I>35*DdiPeW*6O-#svFr$*>fDCVFk7cEzGH^WzrGKBu3bIU7E-x=7q5#*8< zy9m=lN?bf2R~@8CA0Kq)efQoPA8%qY$#7qKKZ;~8x8elTgR+n;-=cnBipe%6_}l)5 z`_|;(xb*FaK{r8K! zBaF|VT!WNiSf4E~HiOdH(u5n0<&TWR5-4PG4m7QN4D)I4bJ3jt)IFQE_7hhwI08&aDB)Bh?DkwfFni#ZFq(U z#?jG#%!igE?(xGb(n!5A&1*P9ZmC*hjz{gNp8>X5?Pb232nSB>y8(R1t4&A*!UmQx z0HW3_gpxfjZ-Z|Cp(eEA9e0JSx-x>jRyRc^J%dIBQ}*7S@0c&(_@8Y)ceb3xAqZA}II)@;Ugj6YntUBL)WuVMlTbN`Kz2%S;5z zf?=z$N$%@9+(0h#1(rB}m#B4)85|J7y(egD%ulpI%5KOeMJS|7*I_|3#IS0r5*3x; zh4uM-h7KHBj5)A>Zix?|R;(P)N^q^06K3Gd3i7pJ{KGCgNF|>Wo0ML^H*e#MbACqD z?WZ1QDyB*6E=06J@T>C&#wAkAGbD}<}Ws@qc|5DWx=gQNp z-c=sLI4E?$?lFi*xgdkqFB;4Aj99_tHHW3Nb!={@R**xBT9ndI|Yke zeCBYGXJlyt*=X)=pxrwsS+JLtqLJ@|Wl!TeX`x`1Yje+mNw%3nwL&w`{WR^vK)) z{GqXk7b;yMluTA83X7v6%;S`cE@QyQ#|tC;=Pn!5(5Nvxt~z@{I@a+sA;dKUp^Dz) z2qfyM`W=BQ6Gb7C6yg}^O&FSYHR{Le@xPOk4k2LG)l2AWc#l(2v2~_Y7$NeU)vNZ<&I*&uK>kNQR9Dm1QvuR78Q`-_=EheNTUwkY zEq+ohgUsMaH-6Lgd>WGijidO30K3O;tPVP(F)?v>QCEXD&n74qUd(8$Opb7k*qWH- z%kz`Tly_rhs*>XcYgSUpZq;UY``Vgn6(=5c#y^Hz7?@P)!{v|}@1T@l7sQ}UzMk)s z+-g4>U!Bqe{N%V5oecX-c0#xy3~A|Q4~D!8B!dQpk|++z;g`a@8OM0?vr7(c60vZX z2fY7PXff5lbRPy}9~RUX`w)_DWUa9QGv=7r03)#u^trpdsWF>vI_8J&^MKU$fzQeu zt;!o29#x=00@w$YKBgWuN+oWe5b`HOcw^uT^)(T>lnCCOXQ`>_#bx+p+4T zQ0H(=1I@(KQF!e3_WMjXr*>luNOCqXb4PMg@-IJ*E*s>Zho*hOsueB7{HXCeqE;hg z{!y1dHPU$&Wy{T9M#?eq2uX!Fmds%Va`){e%X`&6rHNv)`(2#n|hb=lA20WRdF@%-P(;(cq4q ziPM<-UWA>xFKgm7t8v77)nk+8`=j9#LNnz=4SN3*b5%-*IsuxS7Ifj{AZWwO?ZX6# z%DA1wx%NP%W5x!tCv$i1V_6kGjk0)E6OsCp?m^uRLroLO8X{MfcJ*|5g4(?uP z%UrRs@o4Ari5=KIwi;P(Z8q91jXmLjuR)=ch~~2QKZw7^hfN{|M~I=KF95B~hd{Ud 
zXW^Lu=TyvmSbVCVq!|{#Jm5MOj*SWX`X|p2AnHW^Y@aU*F%hJ%_uIFpq5$4?L|8z32v-@;Dc&9OHpIAfb>n_G<^I_y9oX(3yozg;!l7Pj)$LI047 z=Tf=`obZi%{Kf3Se|pV43A{#BtH9b1_yd2_6%Vd8jJh!-gDQ2(saP=j3K5S!i$+Y?EwmMteH8iMP$tG$Jvy;+tWtili~$Y9Dq?_~zAi&uM)!V3#!_@P!jh1y_5VD;uq zuS0cH6fhjCK28^RGavZoGcR|D&KW};-bhQea^aJcNtR-ZwS1X3rvo{#W}V+=U+~Z| zU+I$tch!b^W+S-dtVrFFz(h1FiJvpwt!BKjgDxGm?%h?bWU!oFG(|2=k4^*wzwAJC zaIh-LRG*A0&!tp31ah*@|0UBzHeD|dAHHUJ`fs0c&aANct$c^fq@-WOJ|}WWynTm% zzdyNJkhCueXBHRmMgOuzGLgC1Pb3?3T(j3l!Ye38TGG3kqk*y8Ch3ga6vfYE?_YFK z`y=MlYxI}!rO7_5_<@6}>$X#XDoU%>yzx6zD0=ihV}= zQzJaeoaCa2NSi0SnJ*JI%mR_+tq%}m6JC7Rad!GmDAP^woGZJzZ_sb|8Y;vB7hmgY zZBlo%v5Fb_BQ)Im9ND~k-sj4P;<}`@`KMT2U$G3xg6 zdmUM0g2GP$2CfNv9ZG)GT?9I2K6BG0H6>$~%t2g`fWJ6MO#EjaKmGcjeJ-og=bKyu zJEeAFV>)oC%=Wg@L7Lv$BKM~zY}Q()wR*L|L8)R8CS=ClZi`exZ^5u1ZRFc!aBrXf zxG~-ZAm$*-2cIBd^NImmd^ra+qn|t|X3sX2mObXjv-oZ=ow0Fis_G?tV>Gk>ih;cU z>zs^=r2K@z`znMqdHM>;o~xgOrC{kh7ZFs_8NKYedM+0CZxb!3C#Nf|23Sg(4d#yegxmW@Qzdb69-uteMM?nGXz~N#k;{Evu?CsFWE`e4>njj)GD8Z zA)Y^^4DNdvM*%#QYy&{g{DS40V6*9XG`($BS7|);b;LKx+3s{NxJDEsIM?`&Q}s`_ z`}>s3ejFDA3?*~3jDW!iU{kga?*Y@VoqC5gz4%si+rk8OFNt#rD)mKtS)SsU^~{8X z$Xlvc5sL6xokxpTq0f3c>c2<(Z)rFS@LxRs@t5BIATy7 zasxF%r3cDqa|p2hpGG7C0LQ(>vK@!v+~XZFWPY6`|8R`%p8PI2BRKT+OMr7kmG58= z2JyBq3i;9H8yooe33kzH-@Ny16Kp>Ii`#Sl(Q1=it$UXQi;VBpi?J`JY)Gmgw+|r2 zFrQGpqm4^%B=~U0j|Zbq`c?uPKM4J8&hbhn!6`8WSHoRwg<1`ttOFMbdh1A;=h~Y( z%#g79Wq0LJ%+WoeRIQ_vYZS%_Aqgk}I)#%FZbd)D@>EengXBJfr>+8#A?tN$?&IBY z$B*0v!3I%`>a#L_pXF0y2DyfD_X7hozjn7j83lAA^rPobWWoE_G zhTM`kbzJ(Jx5r4Ov&rV2ot$1Nn5ik|r4p@q0X_Kd?wI){La+xTW$)JHlJ};%^RL$|ZFVQ4gE%{c z?SX||Q&;o;)IR8+;IQDBlkYX(q6mJ#)>z%0wG;c|+s#B=O1~5shZx2mw6wvJe~bdB z{z;WXC`?!O6bN}bv7^S6S1&nQ&ncwsI??JaqWsp;g~IpS60b!iMXGfigLd&Jau>+M zfN}_Zx#M7n0!5-&GK|i-1YcTTuGvjo#Ol(bY_Ry8MTcc&ObY{!TktaeGOL^Ra3oA! 
zw#AoQb%A6+=P_ZYvSvhZ>GkP5A7Q2I5C%w5&BX5pv3Um<9~1DNXu8L?7cHEUTKY^1 zlQ(t$jnX2DLj(TBD-1z@2PKk$FKHF1I7+_IcU z4Ew#^`MZ2&;=3CBr2?q-<7|B58v>H$dVDL#hiw#01KApRV8=1$iW`x(oS4_k&-SjB zueFXc_2Q#$PfH#U;!^l+dm4fz>Srqh4sfo|_inMS>F%58f|<{_wzX&1eKRE-l_~jk zmYXRB?oFlLMPLR7ty!?Y>2G%;7Oshj&qQRHz#|Wh#ZzH_rqi;)SAk_BZ}u$-bdf^O zkxV6e;lKZeW~Q1L39eIjOM#69l78z`Qs%(gGaveE8}B{3vxR;*g+Ng?uhh%#^{4Zd zApn8%<&_TyT8>1(8 z3y$~=65*hexU4vpG^Dy>S;c8jMEyxULb1M;hl(#9@tHSOwM0m8AXVObskIp~Q$)7e z;^f>1YjF(Dmt{oW1yk`ywvE^U<1VB5Cx$K$djI0K`55(N3a<-l$;}t za=Ztfj|NhM3dD>ND%-74vLx8jAE|CMeTty0{~3q2jJf81T=7Qr!MB+%$VeC;usO{2 z-$0|}uBtpNfSWZ5-$hSc)$pe7B>Gpu`gErE&QLL`Ufpx#|I<;Me&zq%s7+TgfGSwT zil*VVUr^oXJ}b0vvw*T~27gstpVNNy-im$cn0czN#KTC~cf{jyU$H+~fO|td_Vjjg z#4K^4hFaP4Kt`#RR8b*BPEOi;zqmrVFpiM);@Bi6n&E&0M*}Q!o zO%j)xboUJvrIrAZSpCGjl|mKIjfLN787sO4YEQ=jwI7O4mGvljA+)Z;Ki}Le_rDdn zeSJ#V>;LHs(Ow=bWhZ9Wt&|YEoV7w)=ZFx=^jQ7qo58^`VQtnjcuT;T2i|l3vONS? z1IN##n`tkm+-&NR(WS0|3mt#|EUFHkQKi*q;Puz1mBaPz=gQ&DLT&e^29{!A9!=sN1EJQDR7qv}{<_i5q#rP#2aMNUjZ zAYyW?iIPz1Z~Qxj&oKNFRuw_U&KB+ZQfsGrPCBQvSQCDm_y4&IRbr38VQyFC-OEdM zJJTaFWQM!sfvTO!ViDKRx4>}wy!Ubibj)y-?4$Iw(b9)3ZL=}Ty*Ot9n?aL{kTkc+ z(h=<9avF$nZA$;Zpx+-KT0^Siw3PYF$ysAJbNtgH)^IYEpY^7G5siH#Vgx-Tqo9^6 z7n(c1n4WBH*Ep)s-I2qumHSJc5TyjW-oUk~wlRI&1u-M>`ZsI@$<#i1?$oQswAD86JVoaT#7k#d6=nm~kO?NW z-PB_(F<9Rk2}G@^Rmh5+_Y61KAW=Wk-Ix`ScV)+I@jEhL(d3zot2t^i5h1s(WqCYQ zmO_UHm7e}$4Nt2wu4ozdd+^-wwYLrIBydUuDl}xv<~iJPk(P#c!RTRsQ*1ah8UAuv)8x$2aj`2nqmXBxm8^{;0C8{ zZHW9M+wR%Pp3x0|gGLEFc-my!SaifsWy33nl`p}2wsE35FHv_fq5Y6j&F065-$CRS zb+!LieICfMH~3fezXLPeVoWGUMUJT#)173#KjcWQShyEEfYmux_N&={ z1+>-}YYgmk9gp$qMR9PmiC4P~-h|h5{rrZ(sbJ|9^n6}$PkicuMPzvnH* zTSnFLz}Qu)zY*Kx8OJUnr4t8sO4m9PG|^{7|C3e{<=&gHb8iGtAN-PErVsgLXY{Q5 zdjmvPpYW`cA8TX4oVNFNBMR1-{EA(gV_{Um8N|gEdt&liwY{oMzrbv!wEIc?%UU z>F=}6r!Q3QM5e!I`x{&YiQJ@>At<41U&!H#&Vv-94)1v~%AL($y~JcpafvL9?F*+I zFsHMUI%!A1g3WBU;S<;T#5emgKKGY@W8p@8qW4S&Ri2S@zBS`{gqcg%Q9DUuR&gJVx_EF|8*7Y;lGSu8H^anm>G zZ`{`)$QQH_CtLPEz8E^_NFhPb=u$C 
zk1cT>d}vkKp!cjF!As-1IXxSb;gJxE5}Bjla^j!+{KRyJz$9`PZxeJ(2K=jX6y(UW zyFUqBm_l=-K9Kve#f9cs?_R;r1P@u4^kxfHmD+?P2gbY(&!!fo!9OsdK@-j;FV~O6 zv>ikQa2qkyek+BkB{_L97fJjoe9)k5c8cXO4wO*Q-pNUQ(pp>T^=W(VgC6=oo&5=~8lHc+lR)PuZN zgpr^P`BFoJzH*k!NfUYq7Wr>ShI9%v||eKgD#6i;Iz|yecOKm_*B>`IEB=sA62rS}X&|h3 zLVu5&xHU908R#OnI3~#b#!S(MAcq<|eF_5ksBR9!{P?}RW%^YqT1<_Fkw=^QPe@h0k-wGQ56sub;xyR762l|X zOZvMfeNJtPnlBMBj+d$Ar_I0)vWRTeQhhChh0;uf(p*E6Es&zXR!caQ%BPCX3_oMq zStos(K^$HanC>Cmr&twr&wu3dKRBMS*($R)Hai5;b{CQ6jnB{x94triUUt9SVQt}N zFlNtF5IErEC=2AIZ*!%IDwu5ENn81ztEr+>U$1HW`Aqq`yR4Bb{7Jri2@1)M7j;pXF!)0Y>9E9CK@#Eog4D4kcn-qsaC%dk$%dA|l zrN&yX8!L@_K51`-(Ff>VNL?NnUeDkqTXHtOcXmAl3649SM6;qpFhuG|Sr zjNG>#Xn+k!*xRKqCmaace2K9N7p%g@-rs10h$C2^Eb8rr+x`v3UFD%kKwsi1KvqVH zWmQt#uv=Q3QhUg6?K_5n2&d@hKCB2|3XyS3s=d> zYITVP65o^|9f&hTu3m0BbYiyPY3L%19bW;Da{4QA9+2ji;mi&EI_@{v@YQkrDQ7u=kDJ6z=);(!(+t!%nJ3QQ{8sH7nVAag*NXKC|6fPB+%B zsW2W-DA{BFxo-xhIfYM9%n9$Bm||IhZArrB*UQH^(V=yRYJJP2k40-K*vV#t!qvVR z|CGyzXZuu)ajh7y^5F)yNwEC7M9>2p>f=LlX9fZbvj%*HFU<~zGegiSg1nowwe^*# zn?DGTDb#Bw_eFS#Fg5VIi>bYdNh&N4y*rUG*3wN-EzNuJev*OAgdER03jr=AmuLY; z7|v)xmmH)f-SCQ-;tyzda@K6OuH8AQ(q35+Nx5X;K zgN1^7W18L0#jqGvTp$b`0n}b(8qB#v8hoo9O1}<_lVEJDJsVynK9=`fgZadswkYu8 zyPZkXllmM@7xVBKtm)dSE5iLqK1_*Ma(f4uTn$RO7}C3)BbizdWdrsh>vv;yT^(l4ZV>n zJ?# zU*CspP?)2#dnm9w2ayKrqksDl;Us$${>o^{`fYVJk@x1juTY{%%isdezHalpAQI=5i&^Oj#}S5dF;tRD`N_7sg+)95BAYhwx-i!1{tO^x#Vii?YuyjUegvZG7O-V`arg?W;Y}jM z6Nk`I$x$*>PymgSJ9BLNNChdb-jz9deEx6i%4t9f0g~`=O5j+&BLv>Ku8SWSc-6HlLs{F|wE zP;f+85HdvzNrD|%?(%PXnEq9HhKCHHSeag5BC-joT3?q{r$%yWTU0Bl2zJUBL8E0g#^s-p8!Nq-^uv;^s=HC!E)<_Un zgR=#UGZm&84!yK2wo=_VIus<}9U6XN;~CC65aA6SN*;qJB^c{+t3V&JuK;e_MDWMU z-Ju#1U1Y{TT==6u>Q&qUi34*(xKGx^0N;dh7U#IvWf(A#N1)7LYOe!_FOGV?9|d#^ zCYZc6W)-}czK)_VWBS9^QdIG0l}He87B6^?dPqy88%2sql3G=Ehf9Jj-W`}sp|AI| zL)TjNVVQbT;2%A8KvDXcIeFk%rHul}pj{ZdBc*NFC&^9Iu~hh}Ts*~XA~iTYNQ?9o zb=SEafe-TEmSXanZy&Ak$J1@gN=wtjSgYW9J*a`bMaEh=JgCjePqiO0F!hIiG*#s+jkZ(qOQO06AIW 
zeamu!D9G`=PpxJ_AxB_IQJF9Sxr_#l5%V0Iw4M>2>OCYh^(SrmbGZ@Wt~~EHHG>>uZAkxpo?YyT3gB%l_bc~sNgXA3% zH8*0H7E-gV4;H10@a`p{)$ox$YuJ_eoAtaF+pW*?J*vLIAM1yolh@Y$f3fzK0d0g$ z+b9)ETUuywiWPS#?oiy_9U5GVTYypu6nD4c?ry;)KykMq!6gth=t-aFyyyGg|L12m zyR-M)bKko&*IYAOxT5w_?{!mnZtZdYwoAehg z76wyg0@ifuG(DXfx2HVeCB^&s>_1w4hFqJ*ISw}x^?yz)@BKyrXej(TgthvUv-J&w zBLJ2jTnsl+eWP1^!aH&je7=Cskgw~rO}#eUur%x?7bA^ZC?(`=|eNyLNBl4N}rQbO-(fr6Hcx4cgWkt*f9N=#zl$sUJFwc`t)976B52or%^UsP-YnKFx z+z*o+V3zlHJ~6DGINo2ECESsZwEudN#9}HCJ`9wU5wpMhLS53cfv?^z{#MTK*CK_>`_O! zg0J5Ag#-Q3Nlw69zn&hrBNNW5cVOShFt4ySID8R@?w7$p<84&L9Qk36Lg~(ijJwC{ zY4=SVt?Lg@6xoHx3i`qaftG2kKYcR0>XVRod9?}u>f=`FL(7P9e4eE$T z8XzY1D61&QACu8YSZ4K;!!uAn!iPxbv6rzZ2;7*!SB$C(h!u-2W*2-%FZ z%Bj?HHM4m8lM|+^94|=Thj!`w04Q4r%jDc`KjBJET?i~ z=`veedD=jXNS7og$Srd0PgPGt!QZuYeayag3qe>#90d_6!#t5%em9W=RMzC72x<7) zrn?C_AS znbsyPQu_SKf`$f+khxk$(_OpDapm&CfA$d!5?L6-xgZ~BIoE^?%w*UHd z$65vdCSxj&4!w)|FA5S|gTLyIMf^F|(P)9>vXjxtu8PbwcBsBF&#kZX!@HT9ELqp| zJAAaNmCJ_Tbj`bW+m^&=)FwxcWFm3PSK3d1pUx5EkiRg0KMZsce@eg&nMgCvT{18( z((nGV4RN}Q?@aJck{ohv^8cb1brGI^OR+%ph3Ips6ORN~-CV4T)J?06RrQS|;~4sU zDaKui1A17$?u=PpE^vhSk*6*ZazcqnwH8o*2KSAu;0zGR9IQ#dMW#4f(b)}G7d@Tr zNJ_8{hZfa@GECIDLkitE$SMS*5$o`^=tXHMMEuV{)&U}2a#x!@7NZV+YJ(k zIEPm#ctjiY6wSpN91=0m9J4rHO#(OcWWzGVeE~KZaaFI4@g)O3w1v2JHNlsC8IZ_G zT_Vl*?DHYx1VYOcB!s-@+8)xJgclvW{;QALvUd?{;seUz*}y&E;6TiXj#Js2r6*)tav z$-vyt%6J=-DXN_7J8#!l#fN!{hXkBY4~`9sOZwaT z9pYNSSDO6as|r`{OY>*GsK_P9!S(D(&VZ_8#P7TK z2wdo@#MZVXuf%?zDyZFfx@_rchV1+~&lRW+7k4mduA&iIF4?34v=*5UXR=TQs=a;U2W|GyJd+E^fU@KwOnWnC1pvaM@N-=TTM&SgMqiTffn=z2K{ z5@I%0kp9k4@w6#+@PV@sA*0`owXJU#WH?KzL56d`7~)ET+{>YgpRq|4A5HVBUfYO! 
zGnBa=gqi$JXz`45l+|5HyfjT358F;Xq9$i%A8`?>fJrww`6+z`DetwNu(Y+gSLk`% z|0pgij9`g-MV(TXezbbt^(O z508zIPBY|#jfOkGy7pCx1I-osiHH@Vw2w*YXO^^bEjs+~*bns(J=*?2U~+g>^%%*=9c+XkR3yc`FT7hm8N8HdfDPY>S8P>|>fbz=_d58)gLf z%5mJsN(OO`CqUO3^X$KOsZbxzti73r;a9zPbdcV%aX{IOqK61BGDWR->s25BbmaUm zteJRO%4QSG>k+;y3kfnd9^cZ|)G|+oM$z&XbWju<~za7i;<9iHP3%0rBwX<9yev=0_OcEW82RQ4_$)DZSGEU zani!8;_}AOny4p59t-N&iB7LLFH9*?Tid;M+LeY)-;ojmP?Z3|HUBcr4iN$P8 zX3qfq%2`<>rOa0_D?B!Q^O|eHsMV^k9psQQiyB3fqTP=ruoFjir{0DsS`0_ugP_ic z7CQNkT{eqy{hMOyfBRM5{<)73X3j&2nH4MYJR7CL^mn0{AR{S)4)e8hUaJ<@=x&sM zs?@zXx)%JMJznIJnOv2~6)7-`y@ka$EM6H*TQ}-MYN(WS>gW98pO-R9Qw=%)kavCe zs>J6^&pdNdJR`UaUR!)%!n#lIUGCqm5hA|5w5Ar_PWqOii|EQ+eMw7Li>>*Hd}L(c z{kPY18m*n-x~8$$f&&u<0gL)k`0$Qzc#ALHfuu(5ZT}g;mTf^)K`>U~rRjj}_kcP( zg^y(pU?HcmUk4Uog}M(YIf*7ElE0-3+pIrFQ)F{hr6}H|o$EY5CLr~q{x|bj>0MFd zd5&v_?c)|{*?e((G0B0QykUxS{G`L9irQs%>oRXSn{{`Nb52KiH(ccplqif+gL~rn zqCBVR8FVcYb}D#(y}Sf@_+ySIQm>ucD7rg7M`A%q3ooy{Nr$;TBAGlx5e)~*h3-6d zJh9hLTF%FFE2bEaKS{RRM_HRC2E#WHY3qEu%iAY3+9PhDI!tdRAi`xXka9#J{2ZV98% z_UNp!ZC>}HJWTn$ItVNF!*iHdT!hW$BAOd6a>MAFV0&J9)$@lHPw72+vs{D~?K3YP zpEZmf*Rx}ONq>EAt4R`BFF7JbsD>rK8(Y4>ol-6jCwI+Vj+c3l)m$5c_N>2@$(cV# zw^bd3%6QdU$1$MO5d9zG7lX zk$;5^z#^1wSk=8xtD@J-V)Cu?YszHT&?VV?nVq^}W1%;-iMnAoYc41VjX`gZrZ>#! 
z_s3!&Bx+?Aptzp8yU~`L$(U}UNU*)kH=mLHQm7!ovkur~bY3d${f5l2Y^^|wNNAdb zTZe4ntI{sRi>S=x3oQkTB5cnC2O!?ed!gl(nii_ElY}QHqe)V$> z;Q{kldz`0i4C_}VW-7(nuLCNlH$L1>8pXHm+nN7R$5r$#4i8Ws_GoF?BK4T_%5hZ8 zxna0{9~{hBb;9A-f{aPjbz6Z?uMnUfWKdx^mY0UZo38L8KHzKSts+4lKjErR0kEu< z;@EEh+RivXjDGS#M1URLtq^>t79rcRU+TTbzaJ{k_tfXo;Lm!7b?oL_ybL-`?$uVG zYxOX_sH^kiDDC!WJJ-^8k^+vOqcqjJA|AK1vwVY+{w2kccjQfD@?e4()ki^YA6oVJ z%d27>V{Juw@s|-4+C_|UCBE%=3HAa_ z%^{;OWonM>Eau7io7$rQh~>XfD*Ak4u<#gJr#3#^BJCQEF?|^X(#D7snf|pHt*K=W8rm9P~8!r-JE)_{#NQkAS z*#BmbH9MPm8vo{EgKRDazUlQZ*4-9tO}2*&w#B*y(W@h9M4Tur1YI4fTa12uhn(p# z|ES0qj92yN*-3_gn4=)VeqWwQ4}24m;bvQgbnaLf4KUj?%at|~GtB%0H=oDIx z#78^fOUVoAgoHocJ+-b`Mh%l z9&(TXb?8V|K|R|)m0U=P+i$;4KJn%c8i45cF2Rl*<>wD@40z-@%F}Ijw(m=yZ%Bc{ zyP_wQn{|8Z8_;68p>zL*HK>#@%lYbnW|3X1Tn@^dR4$oo#7G>{Pp2r(o zw&c-nAv9QD=jFizPcw#aL zX*Q%iDUps95>>PdvJwb5T<=|w!(KpQj6ac7I^>^PzZ;J<<11^k!K zhbfMEB7_)!LKv2j!H*X4KV+c&f3hk}j*){CL>sz(@!ynKV~@zw`Tupu^wnDKP(*+m zAp1W`L5Vm{@gyIcI>}Avgad#gACM$vSeFRVi$n1L{Jf^k`xR21b%a&lC;gMqbNvMB z&+%$5-WNFmR7l?P$(Vl*(o9}gsy{4I5Po$34!v7eG7(y&je$v@k8zx?5sr50e|mt( zbUe+8-0^l{dxis>mUv7n@aNu)T%fkGld+&ILq%j^cOpXn{%&T1v~+M1Z$?}avHbNJ zO+%-&0XC!9VVOTiKU55m{;mqTH23**nBy{^L3go*Tj$Z6*o$fsf1wI86{i zBpj=16~!DuU7SHyaJJnS(Ls<3#K^NBLcq>0rg7^An7f7-r*G#ClN{Z)m-nyb?^=d6 zGzvn>STv&Mm8!L*mfw`q`at7GcM5qq&+%(lS}KVzRqP=4i?OsTqDgaODYY~loNS8& zXr0?#oh*j9AQ`3XikkWc`D;@mfWcp_qtC5+OiLY*(W(tfZe7jhn_8V{8qzfe4ZY#v zayU_gT6`|0J%eP?+!;*}{eF3~1v6I}mbzM3+8!+;_|nYZAPzPF4Kd{=*ej^sSIARG zR0>SQ?@ksoR6%drmQa0(!o9M@q)GOHpCB zr4Rmz{_R7miKbhhAr7T8je6znM3kS6V5yy+VG3i22ZB4W+dOMYkMZubsctS&U%gfDXP4RZ$9ouW9&%$o*6 zXLsX6<#K`PhLYrsEN;Vq`#LOvEta|CUB_*uC=3PH7jc@=IXw6y2!A5_NUbjbN zkwuMHzvDW8G4zWSr5t?-$*uC=3S7F3=PeXEQw6z9Tld5b@P8DUTh;(sswvgk*26w5 zi)(0e5rXaM7D+~$lJ(Xwua6=H)t$A@Y`nW(B2IcH55)K8EvX=&PnG0?$?Q zXK*p4Hv#@#p8|legma#?FqZjgPnqoBr*shaPdtQe2ej}^T3@|d|1a)o7V*PP=$B+* z54Mwm$Zg!qI!zuSjof3O@`V1!ugWjdqfRI0O5ZdngqNX?NjzVt`3w=*zK?WN>o9|( z!$|<9qnF%?j_E@@@2&O9+LEDd8QSwHG{Gw=!`enpcV_u#5kSoQ)T<F0nZy+)E8DVJnWbyE<~I;sPRAD046W 
z4X*8M(F~qtVWb6(r4{xs+4>VNeDoz5#+@m#{*I7IMnONx(RrJ39L(c&TOzJi3%o8= zWoHz{N%1~5)TTvN9u)ZQVk43&es28%bh;3Z*;$)t)u2j`1T>Pg;EI8|`TtqQg=cMy zSm`H5Qzjdyd2P1hat>RkXgjaPxgF5oD?>@PyQ(KC$U1ny)yOmC2w7TK*1p9joD@RS zR?4<=w}Tt_q*btO4f>u=dWnS2$2*)%moP|v5fS_NkGPJcHMwiJEdI59Wup0SIkymH z50s`I(d9!5KVwUmz2D|9ISJaZ#(uhUIhIsytUu-&fO|PB3Bw;JA zjz(AZQXbak5II&yXhCbHE}c!7+o~pxJc$%oHqV*4Fr}rRcZK~x>N3A_b7ZgSV0TZp7M0ofV~RiRM0LHpYtHT>`D6xnBIesI$5z9?fO#^_4OPFT z@Obfj{m*sIjHJ7())+)=2WlcN%W6=85sdmYVxL8()P0kz2#hA?D!9K!#o)2)!E53N z*8(g>j8`Q=@!CyT7I|ZZaosg3Q2ls#V~yF4hl)#p=gOncr?%fS0;qB^D? zjEj4j+OyFcWvysdO5F-24_a!=Hlvp@jwo|u+gb+cDN$h!mN?wAit#j_-ulzQi=J-1 zZz_rP$vi6!Uzv&ysW56K2{Wul2@rjUon$vPJVv&W+>NLIvwBf9w2e)X`jZ@qkpO2r z;f@SQfL>R*jHoIHG!>_f6QYi29OG<^hOeep9`Fi<&m5641+vB}yk(8DI6*!X^Zm|f zc@`M>wkQ!SbRH~dRdxCBM>@;4c-!@)y#sgp`GSUUq9tDOP1~1U4lNXrw3pTw-2_1u zS(~V{<^3}+%qO{oXmgKmQ(j!jfEL9JV{V-7a>eJk1{{`0stdp{bK^qnPvi!Qte$7+P*gF{yx?z# zA*vA}MWr-PDu{t7EB#0~qH8@Cm|Q!mP!{=8`XFEBDHk}+&)l88N)GOfljt3ZidvTe zSqB_M-Tjq@3&~U<2%fd^tz{a0o4@W%~nS2?91)6M!a zf(yL!ihtZRC0s6{ygCsi)Yd=9hg>FVtxPaVl*IV(%#+wiDA@76mLU9bjt(Af43P3= zal7j)YD826s_okntjP%GTRg;xr!}9;bCE{h!L3Qtuk$1@C0g!3)J|6ehaSg1cj0_{ zfu(2&dI^Mk#oXmYeo_T~izLWrh-ZP9HOiz!FPGnk>s)QEg`ACWP|8d~dGRwAY8SF^ zg(Lmd>yrf@gy8BCGO3E!enmz2ZDNcF>NBiqOFd#w)o68@d-iKqGFUKNm=GYO$h<{G zed3zGf3l|i5i-_7q^#>?bG2t}o8G)oKOLV$?6he(?W-qK_aHtCpw+uZ9F2yYiE{ti z+c^0mWMi$;nno7Uj8Rh;U3%_`j>DWi(saUcrky+0U~c@mfKi=1|MLPT5AWytIvj2K zNtqAOwxsc?h%gCIc)l`;tNNw6B@2dO7~`OsStk>umkYri50v*nnf-XH56+ho78+yp7cF8U32k&8b=75W z*u8qS)E6G~6b9$00yAU}DglT&HZ%VLp_>u$!FE7x$F}oncRIdY1yY+L zX(bk}#6kFKlyX(xY6&jbgO}2OgDbn#^MtUOaiPI^m;8ZB#US zu`b{hMKW?tPDqqO)+?CjfLW4S*$t4cBq06ix-*6KJ;Lu8N z<85&fP5M;8?Xcz5$z*aNA9k%Ad&#tWr0vwC^i#nZW)jjF+G0B~^Eix$?PaS^L}v4i z#T5B`kx8kEiG0PzTqW!E~~*>!^H{N*wDWCcz$imeS5U~nf0Ry-aQ{J!vh zQR8=lnlwPubEx&pcU~!slmjK+SlGtMt(J0k(<){tkzICn94w-O#)%hJAhgOfj2;iOcb>fxYWbVp zMtH=gF`6l5ey9q<+~Fhy#=oT6gR3K03Ugxgm`-ky-u28pea3G=X$DjwA7@p=3JcIY>@m3rEt>o{aj4?=;Hm%GSX zNcp{f;a{6_=wM_w^Y9mO{0A86wPT*78h%Lk2g}TrhJN@_NY|V7 
zXLe0R1{o742KhV7pKZJgF>r$A0BJyziIc$o7!`csWc z^FXtXOe5)D_N3zPLZLCTzRUIgPCDfm1N;Kx7!~kbMjyC}vwbluar))zCAqxI6yqU} zV{d#>#oJD!jBm~!bNX+8RG=W%KF3kgwN7N>pH{4TwzcDwqqKAA+xCC$iOtV=uhW)F z1UZ-ZqM6a!ic;;wJsl=WMG#2a>Ozj$+SRA>&az-dl^EMBn&x3Z^|?tUeeQeWIKU2Khuq5TTH;|$U4FQNhZLon;OA|J8mMmUT}5SF<#H@A z-QlyoqzEd#%>f7*F5A<41}SY}kuHD68Eh(6c6cE26MK!0{MomjG?!Lk*k~>os?&Y~cQ8r#_Z{_e>0HUxDlUiCM4eE0`{;2?A zDyE<0x>KN$aoD=9C-=4rDk5UfZ^CWB=_{fS!j4Z6!uaBjW~<>nhWBi=%LY8~=WA6w ztlb1x!J>|2gy%ASgpI~P#!WM0$YbcMz^HfYxuL8f+YKG8DV8&2s=`uTex!hIJIB9v zrshf=L0TBya7e}0)kR&*-VbJ=UMx%-;UTd|dessI|EudGrnJWPnL#9CXb7;~Pa)NJ zxXthmi+h}v;h3Wp66oRt-(H*yk{sgMbw=9ldjK#^^g(VbRE2tHqXk%pOIb8#_j+Qy zqK|Z>3B%8}?(ytiW%F8Ws=0cl+ORkHkL%Q9+K2tI7J}N7a^O7eRydoX6RGWFju$=$ zVs5L37~Et)&pHqf)SfmNh>=TG?BwDPdr9c2{*;X)=9$Pip>$hyL; zN+^HbF%}hf)hZ4R_5Wj$ziRr;OA=rqWR*711A@?{(z}w0n z0s}`bbhVzMAbWO2$!$%+kQ)N0ybj=%-CGJl`BMU@{;;c)K#gyF_&0*xcLSf6hG6NK zT%f>=z7-#WjRB9|pi5-e7hfYC*wHW&h-0K4wJ2Hp3=z!OGjLG_9XCfT>5tx2?7~-R z!OMgAOIt@aY;$#)vX&H(x|j)Wj3T2Ln3-TLwK|Y|H-@z1efF^eXq-4O6Q13c3M8vn zn4MFnj(g(^{k#|s7`v5r5stdLB+6cVZ7g_b&>zL`*HXF&DJ07vvS>-%BWLv!gI47o zr-&_Fm9l~-X-8H+%#VB`Ir-_&eyi8O%ichU>i4&>ZR$3y%vCSBN*_}R7~);FoKa!u z^kl!nMPj{@DDs18sHyR6f`gyaM?RPiCF{+IN67^dv@IqWrnKG0OAD!)7d#hy6C4J> z{;34Kv)Z{=h23d@rcIo2f{eM5RXBg4OzL6Oh5Sjd8?PNNiV!}kK^4OjUmp$Tz={&! 
zavqxcT=KloG9(fz>OA+$>{OvN@d1uWKUBt)~30 ze=EiI;sOiu`iWrg;T2-}c}OfQ;!9StqtTa_2#ml;Ca=LcZ7#Qcj!zO1Z^1o23=kJc zWY4?6h1Ky9k3>Nte)@&XSQud0!*Aef)v(bQy+g*G0YZyCQP2O&x80IVLCTZF~e z&l^@_Ahg4UjatXv1KpKulV!3G6f8NGYdrEhSkZ@qzU~1Zo&o*nu!igQwRYVoAKbKW;zeLc^mIq`ff|*G z#NO^VI$RGiCKVD+kD7C1d$GH%FQHX4%cG;_d{P@^P}`beqCgaZOIVBohB`7SzkoIzp5yzk1wq^kmXRGQ;^Ih((%wq$H` zk$izNPM)&&&!3_q#xS}&X(Qe%-$jQog)`i$zV!d&`qV)@cXK;4{HPayBRNa1NUJ9B z12;yNY>Rmyeya2OHY-UUSj}+o@Jq0U7XH@{GBeNC?$pgPHznhn&3xE);nrZbd(xiJ zWPTzZ>*F3r#N!=)k>`s;WF5JW}N&dgUXC9UcVp-qw*A&9qRO`4(n zo@WF4XC>BH9%MqTQUe>23vQ7mzk3DbQ?&qnmK^O*P*yOH$mgXqx56(aZK};(;>D>x zkS#yV%{r>_*Is7f_$omAGzFiPstJ5{9h7RBH42DZN>xfb;|O%<;Amo7l)8>Zyh>S4 zs%}d`eDj9|rOSHTzB~gYfTH>7{e8voRNZ@NK_$H~G*Uy~+q1XAb>}3}8>96j@;o5SrKHxom#a~E1dA6E zm@FBS#=WzXZ_6L($Sl{6M%zV#^yE|o zPRi20Wxag*WmMW^+CZ(nHaAR6U|zDbbSs8sOwFq}BkdF0%S-1!RQH=1r@CY$l>^+aB|$gB09V$yPD? z{mBPlDP^72O?&MzKU**Ij0bjlX^#sp^N?QWZoUF#BCHD9RWrcz-yc~@Xh2j2C&FwQ z0MA$})+!Cnm<_GWxCYs``a6*yv{!maJ>OO6fJu2nF3C!%z%Z#Ag+W$lyMaIt?lfniE6u;^}gi&0FKq$9A;pU#_g(gxYl z78f(Y@znL!p>mA}En*fhitRpDVtm`G^xz0r4M$1o;C_G4BjC)Bz zzPz`y!LVJ`AOT$|=_~i?H|a{qr_(uN+z+~oVNHW**EPXo8h*XT{cL_ zK8zq;;x>I3!hq+#mFcCcHolvQVzPxy7Dy$@zog?aNHcD(CPMZYUwk0qKkU zKt~qD=s%3bcQ}$U0eh;y74neAVj$sNeqd%Q0++l@pYXq*70(^nOa_Wxo z8_cKT33ax2WcEUry}&C$*pc&_5AmR^x0tb`Q(_9gHkFg=K}SXi(ztE139CZpH1fuD zEVJc38ejjNm8#=1Tt1HwW30?(hInwLU+lLSm+Fcm!FBX?TBy(Whi8@a{gNnH6@Qigu!x$%PLQ=l zPF%v%K~S8B68-yoX}K~jWi}c8toao#CpFF>{$unc$CD!Uuy0{zif(G3mJ8z-hz3*$ z-UHx2pU7p46d!6_SFLG*KzQ^UU3pYNU8sIT}1)1eUBRTIwAUQD8FGdTvRoVSIbqA@H^?bBQK zzO*55NP3Lr$|;o%WOB}$LVjfYZ4})>biPW)S_pPois+0qqZ(X zk%Q>P_b+O7Rp9H526;7PH0W7aypJ<1zOs6{($43xGY0of&z?$GRD+$oLhXXT7Ek7I zAZ`_mb!5(=;6>MBdGj%TO@lRnmRC2Ni%s&BDvZ0&yK?X%i299{%qPhAQr*yqK`?pX z9~!u!9LkwXACimQkcw<70;^xE%ILTRO0y>;O@CfafmntB<@f1R{Z-mqAXt# z*q|O%U~7-sBL+5S>A$4yJ<46k%Y7|ADfV@Y2f1`_5|AzShFvPvVx~QJi}5D{vzX z&>G;(KTZcB7D*~2=^0jhZ&``#OAe~?h;j7sol*|%MycjEqJ-E-{BnJ*mCEk+c07EN z8!%m+`D{6&toK`zUQO1!{J~i>NUEh6hCh9)RqxxAfuHJOm2BDbs%d?vR4tz&Y>P>m 
Date: Fri, 9 Dec 2022 13:45:06 +0000
Subject: [PATCH 02/49] Fixed spacing

---
 reader-privacy.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index fcc6bde..13e30b8 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -2,8 +2,9 @@
 ​
 ![wip](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square)
 ​
+
 **Author(s)**:
-​
+
 
 ​
 - [Andrew Gillis](https://github.com/gammazero)
@@ -11,6 +12,7 @@
 - [Masih Derkani](https://github.com/masih)
 - [Will Scott](https://github.com/willscott)
 ​
+
 **Maintainer(s)**:
 ​
 - [Andrew Gillis](https://github.com/gammazero)
@@ -21,7 +23,7 @@
 * * *
 ​
 **Abstract**
-​
+
 The lookup APIs provided by IPNI nodes are able to observe what data is being accessed by the clients.
 This is true regardless of whether the data itself is public or not. Because IPNI nodes continuously
 catalogue the content hosted by all the providers, and provide a central lookup API the need for
@@ -35,7 +37,7 @@ to IPNI in order to preserve the reader's privacy while continuing to facilitate
 provider lookup.
 ​
 ## Table of Contents
-​
+
 - [Introduction](#introduction)
 - [Background](#background)
 - [Specification](#specification)

From b84ea7afeccadb0744bc87ef96f48159190c3b5e Mon Sep 17 00:00:00 2001
From: Ivan Schasny
Date: Fri, 9 Dec 2022 13:46:52 +0000
Subject: [PATCH 03/49] Fixed spacing

---
 reader-privacy.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 13e30b8..ebbda8b 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -4,7 +4,6 @@
 ​
 
 **Author(s)**:
-
 
 ​
 - [Andrew Gillis](https://github.com/gammazero)
@@ -108,6 +107,8 @@ by re-ingesting advertisements chain from each publisher so that they can use it
 Doing that will require significant investment into infrastructure and will be eliminated as a possibility after *Writer's Privacy* upgrade.
 
 Reader's Privacy is a first step towards fully private content routing protocol.
+
+Wider security implications are discussed in the IPFS Reader's Privacy specification: TODO link here.
 ​
 ## Related Resources
 ​

From 9c3c06ea0acb6ea8f363f646a3d9d44068f12bfa Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:21:49 +0000
Subject: [PATCH 04/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index ebbda8b..580ff76 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -59,7 +59,7 @@ fully private IPNI protocol that will eliminate indexers as centralised observer
 
 ### Non Goals
 
-* Writer's (publisher's) Privacy, which is going to be done as a separate specification;
+* Writer, i.e. content provider or publisher, Privacy, which will be done in a separate specification
 * Client to Provider privacy, that is out of scope for the content routing subsystem.
 ​
 ## Background

From 8565242896f6e8b9fa7a0ad6ff6f642e6011ac6b Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:22:12 +0000
Subject: [PATCH 05/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 580ff76..7ebaaac 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -60,7 +60,7 @@ fully private IPNI protocol that will eliminate indexers as centralised observer
 ### Non Goals
 
 * Writer, i.e. content provider or publisher, Privacy, which will be done in a separate specification
-* Client to Provider privacy, that is out of scope for the content routing subsystem.
+* Retrieval Privacy, which is out of scope for the content routing subsystem.
 ​
 ## Background
 ​

From 02f305b403c55854aa556c60402f720ca63a7cff Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:22:31 +0000
Subject: [PATCH 06/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 7ebaaac..b58309c 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -73,8 +73,7 @@ on the diagram below:
 
 ## Specification
 ​
-This specification focuses on improving the **step #3** where a client has to pass a CID to the indexer *in open*
-to get a list of providers where the content can be fetched from.
+This specification improves the reader privacy by proposing changes to the Step 3, depicted above, where the client supplies the content CID directly in order to lookup its corresponding providers.
 
 In order to protect the reader's privacy the proposal is to change the way how CID lookup works to the following:
 

From d184ef12b9a997775a96cced9a28f5ce05f61889 Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:22:40 +0000
Subject: [PATCH 07/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index b58309c..1208846 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -75,7 +75,7 @@ on the diagram below:
 ​
 This specification improves the reader privacy by proposing changes to the Step 3, depicted above, where the client supplies the content CID directly in order to lookup its corresponding providers.
 
-In order to protect the reader's privacy the proposal is to change the way how CID lookup works to the following:
+In order to protect the reader's privacy the proposal changes the way CID lookup works to the following:
 
 * A client who wants to do a lookup will calculate a hash over the CID (`hash(CID)`) and use it for the
 lookup query (hence the name double hashing);

From f070287cb38bef0d436970a8002a90d36ac6feca Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:22:53 +0000
Subject: [PATCH 08/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 1208846..10caaa2 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -82,7 +82,7 @@ lookup query (hence the name double hashing);
 * In response to the hashed find request, the indexer will return a set of encrypted `ProviderRecordKey`s.
 `ProviderRecordKey` will consist of two concatenated hashes - one over `peerID` and the other over `contextID`.
 Each `ProviderRecordKey` will be encrypted with a key derived from the *original* CID value:
-`enc(hash(peerID) || hash(contextID), CID)`, where `hash` is a hash over the value, `||` is concatenation
+`enc(hash(peerID) || hash(contextID), CID)`, where `hash` is a hash over the value, and `||` is concatenation
 and `enc` is encryption over the value. In order to make sense of that payload, a passive observer would need
 to get hold of the original CID that isn't revealed during the communication round;
 * Using the original CID, the client would decrypt `ProviderRecordKey`s and then calculate another hash

From 7cd7d9c84966143169401501bd4e37bb7beba2b8 Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:23:04 +0000
Subject: [PATCH 09/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 10caaa2..a9cb179 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -101,7 +101,7 @@ however that CID is never revealed.
 ​
 Security model of the Reader's Privacy proposal boils down to inability to *algorithmically* derive the original CID value for a
 `hash(CID)` that is used for IPNI lookups. Right now indexer advertisments are not encrypted, but authenticated and contain plain CID values in them.
-That is going to change once *Writer's Privacy* is implemented. Before that a sophisticated attacker could build a map of `hash(CID) -> CID`
+That is going to change once *Writer Privacy* is implemented. Until then, an attacker could build a map of `hash(CID) -> CID`
 by re-ingesting advertisements chain from each publisher so that they can use it to decrypt the protocol.
 Doing that will require significant investment into infrastructure and will be eliminated as a possibility after *Writer's Privacy* upgrade.
 
From 50dc928ba5007e0a4eb027603d60296a78ea9196 Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:23:13 +0000
Subject: [PATCH 10/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index a9cb179..f6a3ec1 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -103,7 +103,7 @@ Security model of the Reader's Privacy proposal boils down to inability to *algo
 `hash(CID)` that is used for IPNI lookups. Right now indexer advertisments are not encrypted, but authenticated and contain plain CID values in them.
 That is going to change once *Writer Privacy* is implemented. Until then, an attacker could build a map of `hash(CID) -> CID`
 by re-ingesting advertisements chain from each publisher so that they can use it to decrypt the protocol.
-Doing that will require significant investment into infrastructure and will be eliminated as a possibility after *Writer's Privacy* upgrade.
+Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade.
 
 Reader's Privacy is a first step towards fully private content routing protocol.
 

From 06889cdc9fdfa9a1f86d7c41de33174eb5394fde Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:23:26 +0000
Subject: [PATCH 11/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index f6a3ec1..7aae181 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -102,7 +102,7 @@ however that CID is never revealed.
 Security model of the Reader's Privacy proposal boils down to inability to *algorithmically* derive the original CID value for a
 `hash(CID)` that is used for IPNI lookups. Right now indexer advertisments are not encrypted, but authenticated and contain plain CID values in them.
 That is going to change once *Writer Privacy* is implemented. Until then, an attacker could build a map of `hash(CID) -> CID`
-by re-ingesting advertisements chain from each publisher so that they can use it to decrypt the protocol.
+by re-ingesting advertisements chain from each publisher in order to collect all original CIDs which can then be used to decrypt provider records and so on.
 Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade.
 
 Reader's Privacy is a first step towards fully private content routing protocol.

From 0ea3bf8e36ff8545a5a74f57b208d2277f0c9844 Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:23:33 +0000
Subject: [PATCH 12/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 7aae181..2bec6fd 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -105,7 +105,7 @@ That is going to change once *Writer Privacy* is implemented. Until then, an att
 by re-ingesting advertisements chain from each publisher in order to collect all original CIDs which can then be used to decrypt provider records and so on.
 Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade.
 
-Reader's Privacy is a first step towards fully private content routing protocol.
+Reader Privacy is a first step towards fully private content routing protocol.
 
 Wider security implications are discussed in the IPFS Reader's Privacy specification: TODO link here.
 ​

From 1d39505d3891044b9cf9323b0ed42ef2142e9a83 Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:23:41 +0000
Subject: [PATCH 13/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 2bec6fd..0a97520 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -107,7 +107,7 @@ Doing that will require significant resources as it involves crawling the entire
 
 Reader Privacy is a first step towards fully private content routing protocol.
 
-Wider security implications are discussed in the IPFS Reader's Privacy specification: TODO link here.
+Wider security implications are discussed in the IPFS Reader Privacy specification: TODO link here.
 ​
 ## Related Resources
 ​

From 14354e28d4827c88da97ae8adbd999bcbb629d6f Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:24:35 +0000
Subject: [PATCH 14/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 0a97520..9823896 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -88,7 +88,7 @@ to get hold of the original CID that isn't revealed during the communication rou
 * Using the original CID, the client would decrypt `ProviderRecordKey`s and then calculate another hash
 over the decrypted `hash(peerID)` part of it. Using that hash for each `ProviderRecordKey` the client would do another lookup
 to get an encrypted `ProviderRecord` in response. `ProviderRecord` will contain information about provider,
-such as it's *peerID*, *multiaddresses*, *supported protocols* and etc. Each `ProviderRecord` will be encrypted
+such as it's *peerID*, *multiaddrs*, *supported protocols* and so on. Each `ProviderRecord` will be encrypted
 with a key derived from `hash(peerID)`. In order to make sense of that payload, a passive observer would need to
 get hold of the decrypted `ProviderRecordKey` that isn't revealed during the communication round;
 * Using the `hash(peerID)` from `ProviderRecordKey`s, the client would decrypt `ProviderRecord`s and then reach out to the

From efadf02b932a86c822474961fd50603c7b90a35f Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:24:50 +0000
Subject: [PATCH 15/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 9823896..4909215 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -94,7 +94,7 @@ get hold of the decrypted `ProviderRecordKey` that isn't revealed during the com
 * Using the `hash(peerID)` from `ProviderRecordKey`s, the client would decrypt `ProviderRecord`s and then reach out to the
 provider directly to fetch the desired content.
 
-By utilising such scheme only a party that knows original CID that is being looked up can decode the protocol,
+By utilising such scheme only a party that knows original CID can decode the protocol,
 however that CID is never revealed.
 
 ### Security

From 7b8b876099ecb3ea4cdd27682a8942aaa114b331 Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:25:00 +0000
Subject: [PATCH 16/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 4909215..c596bd8 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -95,7 +95,7 @@ get hold of the decrypted `ProviderRecordKey` that isn't revealed during the com
 provider directly to fetch the desired content.
 
 By utilising such scheme only a party that knows original CID can decode the protocol,
-however that CID is never revealed.
+and that CID is never revealed.
 
 ### Security
 ​

From 4cd170521ac92e280e0384e55721d54c8cfa9236 Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:25:08 +0000
Subject: [PATCH 17/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index c596bd8..bf7bc82 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -99,7 +99,7 @@ and that CID is never revealed.
 
 ### Security
 ​
-Security model of the Reader's Privacy proposal boils down to inability to *algorithmically* derive the original CID value for a
+Security model of the Reader Privacy proposal boils down to inability to *algorithmically* derive the original CID value for a
 `hash(CID)` that is used for IPNI lookups. Right now indexer advertisments are not encrypted, but authenticated and contain plain CID values in them.
 That is going to change once *Writer Privacy* is implemented. Until then, an attacker could build a map of `hash(CID) -> CID`
 by re-ingesting advertisements chain from each publisher in order to collect all original CIDs which can then be used to decrypt provider records and so on.

From fc9f95410ae7cbbf1da82b8c7f6010fc852fe38e Mon Sep 17 00:00:00 2001
From: Ivan Schasny <31857042+ischasny@users.noreply.github.com>
Date: Fri, 9 Dec 2022 16:26:05 +0000
Subject: [PATCH 18/49] Update reader-privacy.md

Co-authored-by: Masih H. Derkani
---
 reader-privacy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index bf7bc82..261c5ea 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -100,7 +100,7 @@ and that CID is never revealed.
 ### Security
 ​
 Security model of the Reader Privacy proposal boils down to inability to *algorithmically* derive the original CID value for a
-`hash(CID)` that is used for IPNI lookups. Right now indexer advertisments are not encrypted, but authenticated and contain plain CID values in them.
+`hash(CID)` that is used for IPNI lookups. Right now advertisments are not encrypted, but authenticated and contain plain CID values in them.
 That is going to change once *Writer Privacy* is implemented. Until then, an attacker could build a map of `hash(CID) -> CID`
 by re-ingesting advertisements chain from each publisher in order to collect all original CIDs which can then be used to decrypt provider records and so on.
 Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade.

From e848b3c6bb6773591247252aa6ab3d5d0fe36cdc Mon Sep 17 00:00:00 2001
From: Ivan Schasny
Date: Fri, 9 Dec 2022 16:42:08 +0000
Subject: [PATCH 19/49] Add mermaid diagram

---
 reader-privacy.md | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 261c5ea..a9e5a6e 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -73,19 +73,20 @@ on the diagram below:
 
 ## Specification
 ​
-This specification improves the reader privacy by proposing changes to the Step 3, depicted above, where the client supplies the content CID directly in order to lookup its corresponding providers.
+This specification improves the reader privacy by proposing changes to the Step 3, depicted above, where the client
+supplies the content CID directly in order to lookup its corresponding providers.
 
 In order to protect the reader's privacy the proposal changes the way CID lookup works to the following:
 
-* A client who wants to do a lookup will calculate a hash over the CID (`hash(CID)`) and use it for the
+* A client who wants to do a lookup will calculate a hash over the CID's multihash (`hash(MH)`) and use it for the
 lookup query (hence the name double hashing);
 * In response to the hashed find request, the indexer will return a set of encrypted `ProviderRecordKey`s.
 `ProviderRecordKey` will consist of two concatenated hashes - one over `peerID` and the other over `contextID`.
-Each `ProviderRecordKey` will be encrypted with a key derived from the *original* CID value:
-`enc(hash(peerID) || hash(contextID), CID)`, where `hash` is a hash over the value, and `||` is concatenation
+Each `ProviderRecordKey` will be encrypted with a key derived from the *original* multihash value:
+`enc(hash(peerID) || hash(contextID), MH)`, where `hash` is a hash over the value, and `||` is concatenation
 and `enc` is encryption over the value. In order to make sense of that payload, a passive observer would need
 to get hold of the original CID that isn't revealed during the communication round;
-* Using the original CID, the client would decrypt `ProviderRecordKey`s and then calculate another hash
+* Using the original multihash, the client would decrypt `ProviderRecordKey`s and then calculate another hash
 over the decrypted `hash(peerID)` part of it. Using that hash for each `ProviderRecordKey` the client would do another lookup
 to get an encrypted `ProviderRecord` in response. `ProviderRecord` will contain information about provider,
 such as it's *peerID*, *multiaddrs*, *supported protocols* and so on. Each `ProviderRecord` will be encrypted
@@ -97,12 +98,30 @@ provider directly to fetch the desired content.
 By utilising such scheme only a party that knows original CID can decode the protocol,
 and that CID is never revealed.
 
+
+```mermaid
+sequenceDiagram
+    participant client
+    participant indexer
+    participant provider
+    client->>client: calculates hash(MH)
+    client->>indexer: sends a find request for hash(MH)
+    indexer->>client: sends a list of [ProviderRecordKey], each encrypted with a key derived from MH
+    loop ProviderRecordKeys
+        client->>client: decrypts ProviderRecordKey and extracts hash(peerID) from it
+        client->>indexer: sends ProviderRecord lookup request for hash(hash(peerID))
+        indexer->>client: sends a ProviderRecord encrypted with a key derived from hash(peerID)
+        client->>client: decrypts the ProviderRecord
+    client->>provider: reaches out to the provider for the desired content
+```
+
+
 ### Security
 ​
-Security model of the Reader Privacy proposal boils down to inability to *algorithmically* derive the original CID value for a
-`hash(CID)` that is used for IPNI lookups. Right now advertisments are not encrypted, but authenticated and contain plain CID values in them.
-That is going to change once *Writer Privacy* is implemented. Until then, an attacker could build a map of `hash(CID) -> CID`
-by re-ingesting advertisements chain from each publisher in order to collect all original CIDs which can then be used to decrypt provider records and so on.
+Security model of the Reader Privacy proposal boils down to inability to *algorithmically* derive the original multihash value for a
+`hash(multihash)` that is used for IPNI lookups. Right now advertisments are not encrypted, but authenticated and contain plain multihash values in them.
+That is going to change once *Writer Privacy* is implemented. Until then, an attacker could build a map of `hash(multihash) -> multihash`
+by re-ingesting advertisements chain from each publisher in order to collect all original multihashes which can then be used to decrypt provider records and so on.
 Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade.
 
 Reader Privacy is a first step towards fully private content routing protocol.

From 7242649218c50b6a5f40bd7e46792e8c6001eebb Mon Sep 17 00:00:00 2001
From: Ivan Schasny
Date: Fri, 9 Dec 2022 16:44:12 +0000
Subject: [PATCH 20/49] Add mermaid diagram

---
 reader-privacy.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/reader-privacy.md b/reader-privacy.md
index a9e5a6e..9d855d6 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -112,6 +112,7 @@ sequenceDiagram
         client->>indexer: sends ProviderRecord lookup request for hash(hash(peerID))
         indexer->>client: sends a ProviderRecord encrypted with a key derived from hash(peerID)
         client->>client: decrypts the ProviderRecord
+    end
     client->>provider: reaches out to the provider for the desired content
 ```
 

From e4073bec4ab90c7a159c96addae42ee6cec3c696 Mon Sep 17 00:00:00 2001
From: Ivan Schasny
Date: Mon, 12 Dec 2022 18:50:32 +0000
Subject: [PATCH 21/49] Update double-hashing spec

---
 reader-privacy.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 9d855d6..3a2ef4f 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -81,18 +81,18 @@ In order to protect the reader's privacy the proposal changes the way CID lookup
 * A client who wants to do a lookup will calculate a hash over the CID's multihash (`hash(MH)`) and use it for the
 lookup query (hence the name double hashing);
 * In response to the hashed find request, the indexer will return a set of encrypted `ProviderRecordKey`s.
-`ProviderRecordKey` will consist of two concatenated hashes - one over `peerID` and the other over `contextID`.
+`ProviderRecordKey` will consist of the `peerID` concatenated with a hash over `contextID`.
 Each `ProviderRecordKey` will be encrypted with a key derived from the *original* multihash value:
-`enc(hash(peerID) || hash(contextID), MH)`, where `hash` is a hash over the value, and `||` is concatenation
+`enc(peerID || hash(contextID), MH)`, where `hash` is a hash over the value, and `||` is concatenation
 and `enc` is encryption over the value. In order to make sense of that payload, a passive observer would need
 to get hold of the original CID that isn't revealed during the communication round;
-* Using the original multihash, the client would decrypt `ProviderRecordKey`s and then calculate another hash
-over the decrypted `hash(peerID)` part of it. Using that hash for each `ProviderRecordKey` the client would do another lookup
-to get an encrypted `ProviderRecord` in response. `ProviderRecord` will contain information about provider,
+* Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate a hash
+over the decrypted `peerID` part of it. Using such hash for each `ProviderRecordKey` the client would do another lookup
+to get an encrypted `ProviderRecord` in response. `ProviderRecord` will contain information about the provider,
 such as it's *peerID*, *multiaddrs*, *supported protocols* and so on. Each `ProviderRecord` will be encrypted
-with a key derived from `hash(peerID)`. In order to make sense of that payload, a passive observer would need to
-get hold of the decrypted `ProviderRecordKey` that isn't revealed during the communication round;
-* Using the `hash(peerID)` from `ProviderRecordKey`s, the client would decrypt `ProviderRecord`s and then reach out to the
+with a key derived from `peerID`. In order to make sense of that payload, a passive observer would need to
+get hold of the original `peerID` that isn't revealed during the communication round;
+* Using a key derived from `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the
 provider directly to fetch the desired content.
 
 By utilising such scheme only a party that knows original CID can decode the protocol,
@@ -108,9 +108,9 @@ sequenceDiagram
     client->>indexer: sends a find request for hash(MH)
     indexer->>client: sends a list of [ProviderRecordKey], each encrypted with a key derived from MH
     loop ProviderRecordKeys
-        client->>client: decrypts ProviderRecordKey and extracts hash(peerID) from it
-        client->>indexer: sends ProviderRecord lookup request for hash(hash(peerID))
-        indexer->>client: sends a ProviderRecord encrypted with a key derived from hash(peerID)
+        client->>client: decrypts ProviderRecordKey and extracts peerID from it
+        client->>indexer: sends ProviderRecord lookup request for hash(peerID)
+        indexer->>client: sends a ProviderRecord encrypted with a key derived from peerID
         client->>client: decrypts the ProviderRecord
     end
     client->>provider: reaches out to the provider for the desired content

From 3817042515a9e7fb8569b5d1d9dbe98a95abcc4d Mon Sep 17 00:00:00 2001
From: Ivan Schasny
Date: Wed, 14 Dec 2022 19:27:06 +0000
Subject: [PATCH 22/49] Add trade offs

---
 reader-privacy.md | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/reader-privacy.md b/reader-privacy.md
index 3a2ef4f..db9587f 100644
--- a/reader-privacy.md
+++ b/reader-privacy.md
@@ -41,6 +41,8 @@ provider lookup.
- [Background](#background) - [Specification](#specification) - [Security](#security) + - [Hashing and Encryption Function Upgrades](#hashing-and-encryption-function-upgrades) + - [Trade Offs](#trade-offs) - [Related Resources](#related-resources) ​ ## Introduction @@ -128,7 +130,42 @@ Doing that will require significant resources as it involves crawling the entire Reader Privacy is a first step towards fully private content routing protocol. Wider security implications are discussed in the IPFS Reader Privacy specification: TODO link here. -​ + +#### Hashing and Encryption Function Upgrades + +All multihashes have a codec encoded in them. If a hashing or encryption funciton will have to rotate then different types of multihahses can coexist together +and can be processed differently by IPNI implementations. It won't be possible to apply a fix retroactivelly to the data returned by previous lookup requests, +however IPNI implementations should start blocking all new ones that use a compromised scheme. + +Moving an IPNI implementation to a new hash / encryption function will involve reingesting all data from a scratch. Before Writer Privacy is impemented the +index can be migrated over to new functions by reingesting all advertisement chains. With Writer Privacy, Publishers will have to republish advertisments +using new algorithms. Both old and new scheme can coexist together for some time. The old one should be retired either immediately or once +the indexes have been rebuilt and the users have been migrated over. + +Exact operation procedure will be different for differnet IPNI implementations. + +### Trade Offs + +* **Multiple lookups**. In the simplest scenario Reader Privacy protocol will require at least two roundtrips to find provider details for a given CID. +It can be reduced down to one by caching `ProviderRecord`s locally at the client side. That would eliminate a need in a lookup +per decrypted `ProviderRecordKey`. 
In the future there can be a separate service that distributes `PeerID` to `Multiaddresses` mappings in open. +That dataset can be periodically downloaded by all clients and cached locally; + +* **Extra compute**. At minimum, clients will require to perform an extra hashing per CID and decryption per `ProviderRecordKey` that will add +some overhead to each lookup. That overhead can be initially offloaded to the server but will have to be done by clients eventually (explained below); + +* **Extra storage space**. Storing encrypted data will require more space due to paddings and nonce; + +* **Bulk deletes**. Encrypted `PeerID` will be different for each multihash and hence bulk delete operations (delete everything for a provider X) will not be possible. + +* **Operational overhead**. Reader Privacy roll out will be a gradual process as many clients will have to migrate over. During the transition period +IPNI implementations will have to serve both plain and hashed lookups. That will involve either: + - spinning up separate IPNI instances for hashed queries or; + - serving hashed and regular queries from the same instances using encrypted dataset. That means that servers will have to do decryption on behalf of their clients + using a plain multihash that has been provided in the lookup request; + +* **Data Migration**. Existing indexes will have to undergo data migration or to be wiped out complletely and rebuilt again. + ## Related Resources ​ TODO: link to corresponding IPFS spec once materialised. 
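The **Multiple lookups** trade-off added in this patch can be illustrated with a small client-side cache. The sketch below is hypothetical: `ProviderRecordCache`, `fetch_provider_record` and the record shape (a list of multiaddr strings) are illustrative names, not part of the IPNI specification or of any IPNI client. It only shows how a cache hit removes the second round trip per decrypted `ProviderRecordKey`.

```python
from typing import Callable, Dict, List


class ProviderRecordCache:
    """Hypothetical client-side cache of provider records, keyed by peerID."""

    def __init__(self, fetch_provider_record: Callable[[bytes], List[str]]):
        # fetch_provider_record stands in for the second lookup round:
        # hash(peerID) -> encrypted ProviderRecord -> decrypted multiaddrs.
        self._fetch = fetch_provider_record
        self._records: Dict[bytes, List[str]] = {}
        self.round_trips = 0  # number of lookups that actually reached the indexer

    def multiaddrs(self, peer_id: bytes) -> List[str]:
        # Only contact the indexer when the peerID has not been seen before.
        if peer_id not in self._records:
            self.round_trips += 1
            self._records[peer_id] = self._fetch(peer_id)
        return self._records[peer_id]
```

A real client would also need an eviction or refresh policy, since provider addresses change over time; that is deliberately left out here.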
From 84ece7a7285e1125d17f878e2ebab7bda7c4958b Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Thu, 15 Dec 2022 10:43:34 +0000 Subject: [PATCH 23/49] Add threat modeling section --- reader-privacy.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/reader-privacy.md b/reader-privacy.md index db9587f..58936a1 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -166,6 +166,25 @@ IPNI implementations will have to serve both plain and hashed lookups. That will * **Data Migration**. Existing indexes will have to undergo data migration or to be wiped out complletely and rebuilt again. +### Threat Modeling + +There are three actors involved into the IPNI workflow: provider, client and indexer. Providers update index by publishing advertisements. +Indexer advertisements are signed by their publishers and can be verified for authenticity. Advertisements are organised in a chain and are ingested strictly in order. +It's not possible to change the order wihtout having to create a new chain. Advertisements processing is idempotent - re-ingesting the same advertismeent twice +shouldn't affect the indexer's state. The IPNI specification is agnostic to transport protocols so particular protocol choice is up to the implementation. +Compromised publisher's identity is out of scope of this specification. + +Clients consume index by performing CID lookups. This specification introduces additional hashing and encryption that aim to prevent a passive observer +from being able to figure out what data is being looked up by spying at the client to indexer traffic. The exact communicaton protocol is out of scope for this specification +however it should be chosen carefully to prevent MITM attacks. A passive observer could deobfuscate the content by building a `hash(Multihash)` to `Multihash` map. +That can be done before the Writer's Privacy upgrade however will not be possible eventually as mentioned in the [Security](#security) section. 
+ +Provider records returned to client lookups do not contain any authentication data. It's possible for a malicious / buggy indexer to +present a wrong dataset to the client. Clients can tackle that problem by excluding such indexers from their pool. Returning wrong datasets will +eventually affect the indexer's reputation score - these efforts are already in progress. Data integrity is inbuilt into IPFS - clients +can verify that the data returned by a storage provider matches the CID. So even if indexer get compromised that will not compromise +the data itself. + ## Related Resources ​ TODO: link to corresponding IPFS spec once materialised. From 0c80a2dcf412e4d9e34c06cdb93685d9b0fbfcb9 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Thu, 15 Dec 2022 10:45:59 +0000 Subject: [PATCH 24/49] Add threat modeling section --- reader-privacy.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/reader-privacy.md b/reader-privacy.md index 58936a1..266890e 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -43,6 +43,7 @@ provider lookup. - [Security](#security) - [Hashing and Encryption Function Upgrades](#hashing-and-encryption-function-upgrades) - [Trade Offs](#trade-offs) + - [Threat Modelling](#threat-modelling) - [Related Resources](#related-resources) ​ ## Introduction @@ -166,7 +167,7 @@ IPNI implementations will have to serve both plain and hashed lookups. That will * **Data Migration**. Existing indexes will have to undergo data migration or to be wiped out complletely and rebuilt again. -### Threat Modeling +### Threat Modelling There are three actors involved into the IPNI workflow: provider, client and indexer. Providers update index by publishing advertisements. Indexer advertisements are signed by their publishers and can be verified for authenticity. Advertisements are organised in a chain and are ingested strictly in order. 
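The threat modelling text above notes that, before Writer Privacy, an observer could build a `hash(Multihash)` to `Multihash` map from plaintext advertisements. A minimal sketch of that idea follows. It assumes plain SHA-256 as a stand-in for the spec's `hash`, and the function names (`second_hash`, `build_reverse_map`, `deobfuscate`) are made up for illustration.

```python
import hashlib
from typing import Dict, Iterable, Optional


def second_hash(multihash: bytes) -> bytes:
    # Assumption: plain SHA-256 stands in for the spec's hash(MH).
    return hashlib.sha256(multihash).digest()


def build_reverse_map(plain_multihashes: Iterable[bytes]) -> Dict[bytes, bytes]:
    """Map hash(MH) -> MH for every multihash seen in plaintext advertisements."""
    return {second_hash(mh): mh for mh in plain_multihashes}


def deobfuscate(observed_lookup: bytes, reverse_map: Dict[bytes, bytes]) -> Optional[bytes]:
    # Returns the original multihash if it was crawled, otherwise None.
    return reverse_map.get(observed_lookup)
```

The map only covers multihashes the observer has already collected, which is why the spec describes the attack as requiring a crawl of the entire network.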
From aa818ccdd6cad468f73e10a5154d4e42241a4346 Mon Sep 17 00:00:00 2001 From: Ivan Schasny <31857042+ischasny@users.noreply.github.com> Date: Fri, 16 Dec 2022 10:23:04 +0000 Subject: [PATCH 25/49] Update reader-privacy.md Co-authored-by: Guillaume Michel - guissou --- reader-privacy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reader-privacy.md b/reader-privacy.md index 266890e..5e78618 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -134,7 +134,7 @@ Wider security implications are discussed in the IPFS Reader Privacy specificati #### Hashing and Encryption Function Upgrades -All multihashes have a codec encoded in them. If a hashing or encryption funciton will have to rotate then different types of multihahses can coexist together +All multihashes have a codec encoded in them. If a hashing or encryption function will have to rotate then different types of multihashes can coexist together and can be processed differently by IPNI implementations. It won't be possible to apply a fix retroactivelly to the data returned by previous lookup requests, however IPNI implementations should start blocking all new ones that use a compromised scheme. From 7c16595de786b95ae989f52cf7b611ff3e040be0 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Fri, 16 Dec 2022 10:30:05 +0000 Subject: [PATCH 26/49] Update reader privacy spec --- reader-privacy.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 5e78618..00481d3 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -134,9 +134,9 @@ Wider security implications are discussed in the IPFS Reader Privacy specificati #### Hashing and Encryption Function Upgrades -All multihashes have a codec encoded in them. If a hashing or encryption function will have to rotate then different types of multihashes can coexist together +All multihashes have a codec encoded in them. 
If a hashing or encryption funciton will have to rotate then different types of multihahses can coexist together and can be processed differently by IPNI implementations. It won't be possible to apply a fix retroactivelly to the data returned by previous lookup requests, -however IPNI implementations should start blocking all new ones that use a compromised scheme. +however IPNI implementations can start blocking all new ones that use a compromised scheme, allowing some transtition period. Moving an IPNI implementation to a new hash / encryption function will involve reingesting all data from a scratch. Before Writer Privacy is impemented the index can be migrated over to new functions by reingesting all advertisement chains. With Writer Privacy, Publishers will have to republish advertisments From 34968ef4d790bee8027b5871e78c4cf1372750e4 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 3 Jan 2023 09:49:01 +0000 Subject: [PATCH 27/49] Fix definition of ProviderRecord --- reader-privacy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reader-privacy.md b/reader-privacy.md index 00481d3..1594bb3 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -92,7 +92,7 @@ to get hold of the original CID that isn't revealed during the communication rou * Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate a hash over the decrypted `peerID` part of it. Using such hash for each `ProviderRecordKey` the client would do another lookup to get an encrypted `ProviderRecord` in response. `ProviderRecord` will contain information about the provider, -such as it's *peerID*, *multiaddrs*, *supported protocols* and so on. Each `ProviderRecord` will be encrypted +such as it's *peerID* and *multiaddrs*. Each `ProviderRecord` will be encrypted with a key derived from `peerID`. 
In order to make sense of that payload, a passive observer would need to get hold of the original `peerID` that isn't revealed during the communication round; * Using a key derived from `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the From cff45211d40a30ba6d899b2678b0674928cfe7c9 Mon Sep 17 00:00:00 2001 From: Ivan Schasny <31857042+ischasny@users.noreply.github.com> Date: Tue, 3 Jan 2023 16:37:04 +0000 Subject: [PATCH 28/49] Update reader-privacy.md Co-authored-by: Masih H. Derkani --- reader-privacy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reader-privacy.md b/reader-privacy.md index 1594bb3..1987101 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -92,7 +92,7 @@ to get hold of the original CID that isn't revealed during the communication rou * Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate a hash over the decrypted `peerID` part of it. Using such hash for each `ProviderRecordKey` the client would do another lookup to get an encrypted `ProviderRecord` in response. `ProviderRecord` will contain information about the provider, -such as it's *peerID* and *multiaddrs*. Each `ProviderRecord` will be encrypted +consisting of its *peerID* and *multiaddrs*. Each `ProviderRecord` will be encrypted with a key derived from `peerID`. 
In order to make sense of that payload, a passive observer would need to get hold of the original `peerID` that isn't revealed during the communication round; * Using a key derived from `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the From 8d5d6028484f3533466c504048712cb3f8ce8915 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Wed, 4 Jan 2023 09:20:02 +0000 Subject: [PATCH 29/49] Minor fixes --- reader-privacy.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 1987101..25a2f7a 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -122,9 +122,9 @@ sequenceDiagram ### Security ​ -Security model of the Reader Privacy proposal boils down to inability to *algorithmically* derive the original multihash value for a -`hash(multihash)` that is used for IPNI lookups. Right now advertisments are not encrypted, but authenticated and contain plain multihash values in them. -That is going to change once *Writer Privacy* is implemented. Until then, an attacker could build a map of `hash(multihash) -> multihash` +Security model of the Reader Privacy proposal boils down to inability of an attacker to *algorithmically* derive the original multihash value from +`hash(multihash)` that is used for IPNI lookups. IPNI advertisments are not encrypted, but authenticated and contain plain multihash values in them. +Before Writer Privacy is implemented an attacker could build a map of `hash(multihash) -> multihash` by re-ingesting advertisements chain from each publisher in order to collect all original multihashes which can then be used to decrypt provider records and so on. Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade. @@ -138,17 +138,17 @@ All multihashes have a codec encoded in them. 
If a hashing or encryption funcito and can be processed differently by IPNI implementations. It won't be possible to apply a fix retroactivelly to the data returned by previous lookup requests, however IPNI implementations can start blocking all new ones that use a compromised scheme, allowing some transtition period. -Moving an IPNI implementation to a new hash / encryption function will involve reingesting all data from a scratch. Before Writer Privacy is impemented the -index can be migrated over to new functions by reingesting all advertisement chains. With Writer Privacy, Publishers will have to republish advertisments -using new algorithms. Both old and new scheme can coexist together for some time. The old one should be retired either immediately or once -the indexes have been rebuilt and the users have been migrated over. +Moving an IPNI implementation to a new hash / encryption function will require reingesting all data from a scratch. Before Writer Privacy is impemented the +index can be migrated over to new functions by reingesting existing advertisement chains. With Writer Privacy, Publishers will have to republish advertisments +using new functions (as the data in the advertisements themselves will have to be re-hashed / re-encrypted). Both old and new scheme can coexist together for some time. +The old scheme should be retired either immediately or once the indexes have been rebuilt and the users have been migrated over. -Exact operation procedure will be different for differnet IPNI implementations. +An exact operational procedure will be different for differnet IPNI implementations. ### Trade Offs * **Multiple lookups**. In the simplest scenario Reader Privacy protocol will require at least two roundtrips to find provider details for a given CID. -It can be reduced down to one by caching `ProviderRecord`s locally at the client side. 
That would eliminate a need in a lookup +It can be reduced down to one by caching `ProviderRecord`s at the client side, which would eliminate a need in a lookup per decrypted `ProviderRecordKey`. In the future there can be a separate service that distributes `PeerID` to `Multiaddresses` mappings in open. That dataset can be periodically downloaded by all clients and cached locally; @@ -170,15 +170,15 @@ IPNI implementations will have to serve both plain and hashed lookups. That will ### Threat Modelling There are three actors involved into the IPNI workflow: provider, client and indexer. Providers update index by publishing advertisements. -Indexer advertisements are signed by their publishers and can be verified for authenticity. Advertisements are organised in a chain and are ingested strictly in order. -It's not possible to change the order wihtout having to create a new chain. Advertisements processing is idempotent - re-ingesting the same advertismeent twice +Indexer advertisements are signed by their publishers and can be authenticated. Advertisements are organised in a chain and are ingested strictly in order. +It's not possible to reorder advertisements wihtout forking the chain. Advertisements processing is idempotent - re-ingesting the same advertismeent twice shouldn't affect the indexer's state. The IPNI specification is agnostic to transport protocols so particular protocol choice is up to the implementation. Compromised publisher's identity is out of scope of this specification. Clients consume index by performing CID lookups. This specification introduces additional hashing and encryption that aim to prevent a passive observer -from being able to figure out what data is being looked up by spying at the client to indexer traffic. The exact communicaton protocol is out of scope for this specification -however it should be chosen carefully to prevent MITM attacks. 
A passive observer could deobfuscate the content by building a `hash(Multihash)` to `Multihash` map. -That can be done before the Writer's Privacy upgrade however will not be possible eventually as mentioned in the [Security](#security) section. +from being able to infer what data is being looked up by spying at the client to indexer traffic. The exact communicaton protocol is out of scope for this specification +however it should be chosen carefully to prevent MITM attacks. Before Writer Privacy upgrade a passive observer could deobfuscate the content by building a `hash(Multihash)` to `Multihash` map. +That however will not be possible eventually as mentioned in the [Security](#security) section. Provider records returned to client lookups do not contain any authentication data. It's possible for a malicious / buggy indexer to present a wrong dataset to the client. Clients can tackle that problem by excluding such indexers from their pool. Returning wrong datasets will From c9cc59fc1aa7c29ef343c6de69b3ebf1ace8c435 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 17 Jan 2023 16:57:22 +0000 Subject: [PATCH 30/49] Add reader privacy implementation details --- reader-privacy.md | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 25a2f7a..2f239c2 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -45,6 +45,8 @@ provider lookup. - [Trade Offs](#trade-offs) - [Threat Modelling](#threat-modelling) - [Related Resources](#related-resources) + +For more technical implementation details please see the [Addendum](reader-privacy-addendum.md). 
​ ## Introduction ​ @@ -84,9 +86,9 @@ In order to protect the reader's privacy the proposal changes the way CID lookup * A client who wants to do a lookup will calculate a hash over the CID's multihash (`hash(MH)`) and use it for the lookup query (hence the name double hashing); * In response to the hashed find request, the indexer will return a set of encrypted `ProviderRecordKey`s. -`ProviderRecordKey` will consist of the `peerID` concatenated with a hash over `contextID`. +`ProviderRecordKey` will consist of the `peerID` concatenated with `contextID`. Each `ProviderRecordKey` will be encrypted with a key derived from the *original* multihash value: -`enc(peerID || hash(contextID), MH)`, where `hash` is a hash over the value, and `||` is concatenation +`enc(peerID || contextID, MH)`, where `hash` is a hash over the value, and `||` is concatenation and `enc` is encryption over the value. In order to make sense of that payload, a passive observer would need to get hold of the original CID that isn't revealed during the communication round; * Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate a hash @@ -97,11 +99,13 @@ with a key derived from `peerID`. In order to make sense of that payload, a pass get hold of the original `peerID` that isn't revealed during the communication round; * Using a key derived from `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the provider directly to fetch the desired content. +* The client might choose to fetch additional `Metadata` that is supplied to IPNI in Advertisements. +That will require another lookup by `hash(ProviderRecordKey)` to get `enc(Metadata, ProviderRecordKey)` in response. +*This step will not be required for IPFS as Bitswap protocol is assumed implicitly.* By utilising such scheme only a party that knows original CID can decode the protocol, and that CID is never revealed. 
- ```mermaid sequenceDiagram participant client @@ -115,6 +119,9 @@ sequenceDiagram client->>indexer: sends ProviderRecord lookup request for hash(peerID) indexer->>client: sends a ProviderRecord encrypted with a key derived from peerID client->>client: decrypts the ProviderRecord + client->>indexer: [Optional] sends Metadata lookup request for hash(ProviderRecordKey) + indexer->>client: [Optional] sends Metadata encrypted with a key derived from ProviderRecordKey + client->>client: [Optional] decrypts the Metadata end client->>provider: reaches out to the provider for the desired content ``` @@ -143,6 +150,8 @@ index can be migrated over to new functions by reingesting existing advertisemen using new functions (as the data in the advertisements themselves will have to be re-hashed / re-encrypted). Both old and new scheme can coexist together for some time. The old scheme should be retired either immediately or once the indexes have been rebuilt and the users have been migrated over. +Encrypted values will be expected to have algorithm and nonce encoded in them so that encryption function rotation doesn't require coordinated client upgrade. + An exact operational procedure will be different for differnet IPNI implementations. ### Trade Offs From 8b74305c93ea2f707e6fb594fbe76a7929a8b4d0 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 17 Jan 2023 16:58:16 +0000 Subject: [PATCH 31/49] Add reader privacy implementation details --- reader-privacy-addendum.md | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 reader-privacy-addendum.md diff --git a/reader-privacy-addendum.md b/reader-privacy-addendum.md new file mode 100644 index 0000000..10c73d7 --- /dev/null +++ b/reader-privacy-addendum.md @@ -0,0 +1,30 @@ +# Reader Privacy Implementation Details + +This addendum goes through Reader Privacy protocol, hashing and cryptography choices as well as explains some less obvious details +for the implementations to consider. 
+ +## Hashing + +IPNI expects SHA256 as a hashing function. To avoid collisions between the hashes calculated from the same value but for different purposes (for example +to derive an encryption key versus derive a lookup key) a constant string can be appended to the value before calculating a hash over it. For example +`sha256("AESGCM" + multihash)` to derive an encryption key and `sha256("CR_DOUBLEHASH" + multihash)` to calculate a second hash over the multihash. + +All hashed data that is used for lookups must be of `Multihash` format with `SHA_256` codec. Double hashed data must use `DBL_SHA_256` codec. + +Multihashes must be prepended with `DBL_SHA_256` before calculating a second hash. Unhashed data must be prepended with `SHA_256` before calculating the first hash. + +## Encryption + +IPNI uses AES in GCM mode for encryption. AES keys must be 32-bytes long and derived from the passpharse by calculating a SHA256 over it. AES keys must be prepended with `AESGCM` before hashing. +One particular detail - is a careful choice of 12-byte IV. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). +Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. +That means that the IV needs to be deterministically chosen so that `enc(IV, passphrase, payload)` produces the same output for the same +passpharase + payload pair. One strategy could be to deterministically derive an IV from the passphrase or to generate it randomly and store +the mapping on the client side. The IPNI specification doesn't enforce how an IV should be chosen and leaves that up to the client to decide. +Encrypted payload must have algorithm and nonce encoded in it: `algorithm || nonce || enc(payload)`, where `algorithm` is X bytes (TODO: define) and `nonce` is 12 bytes. + +## Data Formats + +* All binary data must be b58 encoded when transferred over the wire. 
+* `ProviderRecordKey`s must be created by concatenating binary `PeerID` with binary `ContextID`. There is no need for extra length / separators as they are +already encoded as a part of the `Multihash` format. \ No newline at end of file From 3ecf6a80cdd14aa90722a7656a86ebd282f95df8 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 17 Jan 2023 17:00:08 +0000 Subject: [PATCH 32/49] Formatting --- reader-privacy-addendum.md | 1 + 1 file changed, 1 insertion(+) diff --git a/reader-privacy-addendum.md b/reader-privacy-addendum.md index 10c73d7..7dcf32c 100644 --- a/reader-privacy-addendum.md +++ b/reader-privacy-addendum.md @@ -3,6 +3,7 @@ This addendum goes through Reader Privacy protocol, hashing and cryptography choices as well as explains some less obvious details for the implementations to consider. + ## Hashing IPNI expects SHA256 as a hashing function. To avoid collisions between the hashes calculated from the same value but for different purposes (for example From 51f7b5086427321e866f6af42ed31007035187ad Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 17 Jan 2023 17:01:54 +0000 Subject: [PATCH 33/49] Formatting --- reader-privacy-addendum.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/reader-privacy-addendum.md b/reader-privacy-addendum.md index 7dcf32c..41e4559 100644 --- a/reader-privacy-addendum.md +++ b/reader-privacy-addendum.md @@ -3,8 +3,7 @@ This addendum goes through Reader Privacy protocol, hashing and cryptography choices as well as explains some less obvious details for the implementations to consider. - -## Hashing +## Hashing IPNI expects SHA256 as a hashing function. To avoid collisions between the hashes calculated from the same value but for different purposes (for example to derive an encryption key versus derive a lookup key) a constant string can be appended to the value before calculating a hash over it. 
For example From 9a97942c9510ecbc36f7bda14073251c05daf2a0 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 17 Jan 2023 17:13:57 +0000 Subject: [PATCH 34/49] Formatting --- reader-privacy-addendum.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reader-privacy-addendum.md b/reader-privacy-addendum.md index 41e4559..b61ba46 100644 --- a/reader-privacy-addendum.md +++ b/reader-privacy-addendum.md @@ -11,7 +11,7 @@ to derive an encryption key versus derive a lookup key) a constant string can be All hashed data that is used for lookups must be of `Multihash` format with `SHA_256` codec. Double hashed data must use `DBL_SHA_256` codec. -Multihashes must be prepended with `DBL_SHA_256` before calculating a second hash. Unhashed data must be prepended with `SHA_256` before calculating the first hash. +Multihashes must be prepended with `CR_DOUBLEHASH` before calculating a second hash. Unhashed data must be prepended with `CR_HASH` before calculating the first hash. ## Encryption From 293ab3855c1efae4180dc55720e75e89e3df8bbf Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Wed, 18 Jan 2023 10:57:08 +0000 Subject: [PATCH 35/49] Move addendum to the main spec --- reader-privacy-addendum.md | 30 ------------------------------ reader-privacy.md | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 36 insertions(+), 30 deletions(-) delete mode 100644 reader-privacy-addendum.md diff --git a/reader-privacy-addendum.md b/reader-privacy-addendum.md deleted file mode 100644 index b61ba46..0000000 --- a/reader-privacy-addendum.md +++ /dev/null @@ -1,30 +0,0 @@ -# Reader Privacy Implementation Details - -This addendum goes through Reader Privacy protocol, hashing and cryptography choices as well as explains some less obvious details -for the implementations to consider. - -## Hashing - -IPNI expects SHA256 as a hashing function. 
To avoid collisions between the hashes calculated from the same value but for different purposes (for example -to derive an encryption key versus derive a lookup key) a constant string can be appended to the value before calculating a hash over it. For example -`sha256("AESGCM" + multihash)` to derive an encryption key and `sha256("CR_DOUBLEHASH" + multihash)` to calculate a second hash over the multihash. - -All hashed data that is used for lookups must be of `Multihash` format with `SHA_256` codec. Double hashed data must use `DBL_SHA_256` codec. - -Multihashes must be prepended with `CR_DOUBLEHASH` before calculating a second hash. Unhashed data must be prepended with `CR_HASH` before calculating the first hash. - -## Encryption - -IPNI uses AES in GCM mode for encryption. AES keys must be 32-bytes long and derived from the passpharse by calculating a SHA256 over it. AES keys must be prepended with `AESGCM` before hashing. -One particular detail - is a careful choice of 12-byte IV. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). -Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. -That means that the IV needs to be deterministically chosen so that `enc(IV, passphrase, payload)` produces the same output for the same -passpharase + payload pair. One strategy could be to deterministically derive an IV from the passphrase or to generate it randomly and store -the mapping on the client side. The IPNI specification doesn't enforce how an IV should be chosen and leaves that up to the client to decide. -Encrypted payload must have algorithm and nonce encoded in it: `algorithm || nonce || enc(payload)`, where `algorithm` is X bytes (TODO: define) and `nonce` is 12 bytes. - -## Data Formats - -* All binary data must be b58 encoded when transferred over the wire. 
-* `ProviderRecordKey`s must be created by concatenating binary `PeerID` with binary `ContextID`. There is no need for extra length / separators as they are -already encoded as a part of the `Multihash` format. \ No newline at end of file diff --git a/reader-privacy.md b/reader-privacy.md index 2f239c2..bbc30b6 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -44,6 +44,10 @@ provider lookup. - [Hashing and Encryption Function Upgrades](#hashing-and-encryption-function-upgrades) - [Trade Offs](#trade-offs) - [Threat Modelling](#threat-modelling) +- [Implementation](#implementation) + - [Hashing](#hashing) + - [Encryption](#encryption) + - [Data Formats](#data-formats) - [Related Resources](#related-resources) For more technical implementation details please see the [Addendum](reader-privacy-addendum.md). @@ -195,6 +199,38 @@ eventually affect the indexer's reputation score - these efforts are already in can verify that the data returned by a storage provider matches the CID. So even if indexer get compromised that will not compromise the data itself. +## Implementation + +This section goes through Reader Privacy protocol, hashing and cryptography choices as well as explains some less obvious details +for the implementations to consider. + +### Hashing + +SHA256 must be used as a hashing function. To avoid collisions between the hashes calculated from the same value but for different purposes (for example +to derive an encryption key versus derive a lookup key) a constant string can be appended to the value before calculating a hash over it. For example +`sha256("AESGCM" + multihash)` to derive an encryption key and `sha256("CR_DOUBLEHASH" + multihash)` to calculate a second hash over the multihash. + +All hashed data that is used for lookups must be of `Multihash` format with `SHA_256` codec. Double hashed data must use `DBL_SHA_256` codec. + +Multihashes must be prepended with `CR_DOUBLEHASH` before calculating a second hash. 
Unhashed data must be prepended with `CR_HASH` before calculating the first hash. + +### Encryption + +AES in GCM mode must be used for encryption. AES keys must be 32-bytes long and be derived from the passpharse by calculating a SHA256 over it. AES keys must be prepended with `AESGCM` before hashing them over. +One particular detail - is a careful choice of 12-byte IV. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). +Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. +That means that the IV needs to be deterministically chosen so that `enc(IV, passphrase, payload)` produces the same output for the same +passpharase + payload pair. One strategy could be to deterministically derive an IV from the passphrase or to generate it randomly and store +the mapping on the client side. The IPNI specification doesn't enforce how an IV should be chosen and leaves that up to the client to decide. + +Encrypted payload must have algorithm and nonce encoded in it: `algorithm || nonce || enc(payload)`, where `algorithm` is 1 byte long (TODO: define mapping table) and `nonce` is 12 bytes long. + +### Data Formats + +* All binary data must be b58 encoded when transferred over the wire. +* `ProviderRecordKey`s must be created by concatenating binary `PeerID` with binary `ContextID`. There is no need for extra length / separators as they are +already encoded as a part of the `Multihash` format. + ## Related Resources ​ TODO: link to corresponding IPFS spec once materialised. 
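The prefixing rule from the Hashing and Encryption sections of this patch can be sketched as follows. This is a minimal illustration under stated assumptions: the prefixes are taken as ASCII bytes, the inputs are raw binary values, and wrapping the digest into a `Multihash` with the `DBL_SHA_256` codec is omitted. The function names `lookup_digest` and `encryption_key` are hypothetical.

```python
import hashlib

# Prefix strings as named in the patch; encoding them as ASCII is an assumption.
DOUBLEHASH_PREFIX = b"CR_DOUBLEHASH"
AESGCM_PREFIX = b"AESGCM"


def lookup_digest(multihash: bytes) -> bytes:
    """Second hash used for lookups: sha256("CR_DOUBLEHASH" + multihash)."""
    return hashlib.sha256(DOUBLEHASH_PREFIX + multihash).digest()


def encryption_key(passphrase: bytes) -> bytes:
    """32-byte AES key: sha256("AESGCM" + passphrase)."""
    return hashlib.sha256(AESGCM_PREFIX + passphrase).digest()
```

Because the two prefixes differ, the lookup value sent to the indexer and the key used for decryption are unrelated digests even though both are derived from the same multihash.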
From 7e6f3b90826ca618cdfe2786ab5c1bccf9d8ba9f Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 24 Jan 2023 11:45:03 +0000 Subject: [PATCH 36/49] Small update to the spec --- reader-privacy.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index bbc30b6..398c061 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -154,8 +154,6 @@ index can be migrated over to new functions by reingesting existing advertisemen using new functions (as the data in the advertisements themselves will have to be re-hashed / re-encrypted). Both old and new scheme can coexist together for some time. The old scheme should be retired either immediately or once the indexes have been rebuilt and the users have been migrated over. -Encrypted values will be expected to have algorithm and nonce encoded in them so that encryption function rotation doesn't require coordinated client upgrade. - An exact operational procedure will be different for differnet IPNI implementations. ### Trade Offs @@ -223,7 +221,7 @@ That means that the IV needs to be deterministically chosen so that `enc(IV, pas passpharase + payload pair. One strategy could be to deterministically derive an IV from the passphrase or to generate it randomly and store the mapping on the client side. The IPNI specification doesn't enforce how an IV should be chosen and leaves that up to the client to decide. -Encrypted payload must have algorithm and nonce encoded in it: `algorithm || nonce || enc(payload)`, where `algorithm` is 1 byte long (TODO: define mapping table) and `nonce` is 12 bytes long. +Encrypted payload must have nonce encoded in it: `nonce || enc(payload)`, where `nonce` is 12 bytes long. 
### Data Formats From b28fc7e079f0f3401af9396ac7f626dd4cdc60c1 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Fri, 27 Jan 2023 16:44:45 +0000 Subject: [PATCH 37/49] Remove few redundant sentences --- reader-privacy.md | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 398c061..a0d3693 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -96,16 +96,16 @@ Each `ProviderRecordKey` will be encrypted with a key derived from the *original and `enc` is encryption over the value. In order to make sense of that payload, a passive observer would need to get hold of the original CID that isn't revealed during the communication round; * Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate a hash -over the decrypted `peerID` part of it. Using such hash for each `ProviderRecordKey` the client would do another lookup -to get an encrypted `ProviderRecord` in response. `ProviderRecord` will contain information about the provider, -consisting of its *peerID* and *multiaddrs*. Each `ProviderRecord` will be encrypted -with a key derived from `peerID`. In order to make sense of that payload, a passive observer would need to -get hold of the original `peerID` that isn't revealed during the communication round; -* Using a key derived from `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the +over the decrypted `peerID` part of it. Using `hash(peerID)` for each `ProviderRecordKey` the client would do another lookup +round to get an encrypted `ProviderRecord` in response: `enc(ProviderRecord, peerID)`. `ProviderRecord` will contain provider's +*multiaddrs* with other possible provider-related information in the future. + Each `ProviderRecord` will be encrypted with a key derived from their `peerID`. 
In order to make sense of that payload, a passive observer would need to +get hold of the original `peerID` that isn't revealed during the communication round. It's important to note that +`ProviderRecord`s are cacheable and hence this rountrip can be avoided most of the times; +* Using a key derived from the `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the provider directly to fetch the desired content. * The client might choose to fetch additional `Metadata` that is supplied to IPNI in Advertisements. That will require another lookup by `hash(ProviderRecordKey)` to get `enc(Metadata, ProviderRecordKey)` in response. -*This step will not be required for IPFS as Bitswap protocol is assumed implicitly.* By utilising such scheme only a party that knows original CID can decode the protocol, and that CID is never revealed. @@ -141,6 +141,11 @@ Doing that will require significant resources as it involves crawling the entire Reader Privacy is a first step towards fully private content routing protocol. +In particular, we are + +Someone wants to detect who is looking for a particular piece of content, i.e. surveilling content. For example, an IPNI endpoint that wants to know how frequently people are requesting some website it cares about. +Someone wants to do mass surveillance on readily accessible data. For example, a group running an IPNI endpoint also runs web crawlers looking for IPFS links, or runs a public HTTP gateway and can log those requests, etc. + Wider security implications are discussed in the IPFS Reader Privacy specification: TODO link here. 
#### Hashing and Encryption Function Upgrades From 532e92df7ab3a90f7efee3877dcd423f567e6b02 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Fri, 27 Jan 2023 16:45:58 +0000 Subject: [PATCH 38/49] Remove few redundant sentences --- reader-privacy.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index a0d3693..04cce6c 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -141,10 +141,6 @@ Doing that will require significant resources as it involves crawling the entire Reader Privacy is a first step towards fully private content routing protocol. -In particular, we are - -Someone wants to detect who is looking for a particular piece of content, i.e. surveilling content. For example, an IPNI endpoint that wants to know how frequently people are requesting some website it cares about. -Someone wants to do mass surveillance on readily accessible data. For example, a group running an IPNI endpoint also runs web crawlers looking for IPFS links, or runs a public HTTP gateway and can log those requests, etc. Wider security implications are discussed in the IPFS Reader Privacy specification: TODO link here. From 3f4e381a31261439dfa7d9962d0eca5bbb625428 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Fri, 27 Jan 2023 17:22:26 +0000 Subject: [PATCH 39/49] Tidy up grammar --- reader-privacy.md | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 04cce6c..8b86770 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -94,21 +94,20 @@ lookup query (hence the name double hashing); Each `ProviderRecordKey` will be encrypted with a key derived from the *original* multihash value: `enc(peerID || contextID, MH)`, where `hash` is a hash over the value, and `||` is concatenation and `enc` is encryption over the value. 
In order to make sense of that payload, a passive observer would need -to get hold of the original CID that isn't revealed during the communication round; -* Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate a hash -over the decrypted `peerID` part of it. Using `hash(peerID)` for each `ProviderRecordKey` the client would do another lookup -round to get an encrypted `ProviderRecord` in response: `enc(ProviderRecord, peerID)`. `ProviderRecord` will contain provider's -*multiaddrs* with other possible provider-related information in the future. - Each `ProviderRecord` will be encrypted with a key derived from their `peerID`. In order to make sense of that payload, a passive observer would need to +to get hold of the original multihash that isn't revealed during the communication round; +* Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate +`hash(peerID)` for each. Using these hashes the client will do another lookup +round to get encrypted `ProviderRecord`s in response: `enc(ProviderRecord, peerID)`. `ProviderRecord` will contain the provider's +*multiaddrs* with some other possible provider-related information in the future. In order to make a sense of that payload, a passive observer would need to get hold of the original `peerID` that isn't revealed during the communication round. It's important to note that `ProviderRecord`s are cacheable and hence this rountrip can be avoided most of the times; * Using a key derived from the `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the -provider directly to fetch the desired content. -* The client might choose to fetch additional `Metadata` that is supplied to IPNI in Advertisements. -That will require another lookup by `hash(ProviderRecordKey)` to get `enc(Metadata, ProviderRecordKey)` in response. 
+provider directly to fetch the desired content; +* The client might choose to fetch additional `Metadata` that is supplied to IPNI in Advertisements. +That will require another lookup round by `hash(ProviderRecordKey)` to get `enc(Metadata, ProviderRecordKey)` in response. -By utilising such scheme only a party that knows original CID can decode the protocol, -and that CID is never revealed. +By utilising such scheme only a party that knows the original CID can decode the protocol, +which is never revealed. ```mermaid sequenceDiagram From 40d0f6cb62cdf7723d4295d10c63a184d7d98659 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Mon, 30 Jan 2023 13:45:43 +0000 Subject: [PATCH 40/49] Add Extended Providers and more security considerations * Add a section about Extended providers in privacy preserving lookups; * Add thoughts on rogue IPNI behaviour in the Security section. --- reader-privacy.md | 31 +++++++++++++++++++++++++------ 1 file changed, 25 insertions(+), 6 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 8b86770..4267321 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -40,6 +40,7 @@ provider lookup. - [Introduction](#introduction) - [Background](#background) - [Specification](#specification) + - [Extended Providers](#extended-providers) - [Security](#security) - [Hashing and Encryption Function Upgrades](#hashing-and-encryption-function-upgrades) - [Trade Offs](#trade-offs) @@ -68,8 +69,9 @@ fully private IPNI protocol that will eliminate indexers as centralised observer ### Non Goals -* Writer, i.e. content provider or publisher, Privacy, which will be done in a separate specification -* Retrieval Privacy, which is out of scope for the content routing subsystem. +* Writer, i.e. content provider or publisher, Privacy, which will be done in a separate specification; +* Retrieval Privacy, which is out of scope for the content routing subsystem; +* Rogue IPNI behaviour (explained in the [Security](#security) section). 
​ ## Background ​ @@ -93,7 +95,8 @@ lookup query (hence the name double hashing); `ProviderRecordKey` will consist of the `peerID` concatenated with `contextID`. Each `ProviderRecordKey` will be encrypted with a key derived from the *original* multihash value: `enc(peerID || contextID, MH)`, where `hash` is a hash over the value, and `||` is concatenation -and `enc` is encryption over the value. In order to make sense of that payload, a passive observer would need +and `enc` is encryption over the value. *This notation is going to be used for the rest of the specification*. +In order to make sense of that payload, a passive observer would need to get hold of the original multihash that isn't revealed during the communication round; * Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate `hash(peerID)` for each. Using these hashes the client will do another lookup @@ -101,7 +104,7 @@ round to get encrypted `ProviderRecord`s in response: `enc(ProviderRecord, peerI *multiaddrs* with some other possible provider-related information in the future. In order to make a sense of that payload, a passive observer would need to get hold of the original `peerID` that isn't revealed during the communication round. It's important to note that `ProviderRecord`s are cacheable and hence this rountrip can be avoided most of the times; -* Using a key derived from the `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the +* Using the key derived from the `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the provider directly to fetch the desired content; * The client might choose to fetch additional `Metadata` that is supplied to IPNI in Advertisements. That will require another lookup round by `hash(ProviderRecordKey)` to get `enc(Metadata, ProviderRecordKey)` in response. 
@@ -129,17 +132,33 @@ sequenceDiagram client->>provider: reaches out to the provider for the desired content ``` +### Extended Providers + +[Extended Providers](https://github.com/ipni/specs/blob/main/IPNI.md#extendedprovider) allow a publisher to add an extra information to all their past and future Advertisements +or to a single Advertisement with a specific `ContextID`. That can be done by sending just a single Advertisement without having to re-publish the whole Advertisement chain. +If present Extended Providers are applied to the IPNI output on the server which results into more `ProviderRecord`s being returned to the user. Same will not be possible +for privacy preserving lookups as the required fields such as `PeerID` and `ContextID` are opaque to the server. + +While the mechanics stays the same, applying Extended Providers to the decrypted values will have to be done at the client side. If exist, Extended Providers should be included +as a field in the `ProviderRecord` which would make them cacheable too. ### Security ​ Security model of the Reader Privacy proposal boils down to inability of an attacker to *algorithmically* derive the original multihash value from `hash(multihash)` that is used for IPNI lookups. IPNI advertisments are not encrypted, but authenticated and contain plain multihash values in them. Before Writer Privacy is implemented an attacker could build a map of `hash(multihash) -> multihash` -by re-ingesting advertisements chain from each publisher in order to collect all original multihashes which can then be used to decrypt provider records and so on. +by re-ingesting Advertisements chain from each publisher in order to collect all original multihashes which can then be used to decrypt provider records and so on. Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade. 
-Reader Privacy is a first step towards fully private content routing protocol. +Even with both Reader and Writer Privacies in place a rogue IPNI actor might abuse the double-hashing security model. For example: +* Someone wants to detect who is looking for a particular piece of content, i.e. surveilling content. For example, an IPNI endpoint that wants to know how +frequently people are requesting some website it cares about; +* Someone wants to do mass surveillance on readily accessible data. For example, a group running an IPNI endpoint also runs web crawlers looking for IPFS links, +or runs a public HTTP gateway and can log those requests. +Rogue IPNI behaviour will be addressed by IPNI reputation system that is out of scope for this specification. + +Reader Privacy is a first step towards fully private content routing protocol. Wider security implications are discussed in the IPFS Reader Privacy specification: TODO link here. From fe8f9b9a1b122d38bc6bc094cfed6f45585bfa2f Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Mon, 30 Jan 2023 13:48:46 +0000 Subject: [PATCH 41/49] Remove CR_HASH prefix that is not used --- reader-privacy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reader-privacy.md b/reader-privacy.md index 4267321..fe1a95e 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -229,7 +229,7 @@ to derive an encryption key versus derive a lookup key) a constant string can be All hashed data that is used for lookups must be of `Multihash` format with `SHA_256` codec. Double hashed data must use `DBL_SHA_256` codec. -Multihashes must be prepended with `CR_DOUBLEHASH` before calculating a second hash. Unhashed data must be prepended with `CR_HASH` before calculating the first hash. +Multihashes must be prepended with `CR_DOUBLEHASH` before calculating a second hash. 
### Encryption From 95ddcbf56d1c9a1d208e3ba46a8b89af271a2566 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 31 Jan 2023 12:01:07 +0000 Subject: [PATCH 42/49] Remove ProviderRecord encryption --- reader-privacy.md | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index fe1a95e..8ba1669 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -98,14 +98,11 @@ Each `ProviderRecordKey` will be encrypted with a key derived from the *original and `enc` is encryption over the value. *This notation is going to be used for the rest of the specification*. In order to make sense of that payload, a passive observer would need to get hold of the original multihash that isn't revealed during the communication round; -* Using the original multihash, the client will decrypt `ProviderRecordKey`s and then calculate -`hash(peerID)` for each. Using these hashes the client will do another lookup -round to get encrypted `ProviderRecord`s in response: `enc(ProviderRecord, peerID)`. `ProviderRecord` will contain the provider's -*multiaddrs* with some other possible provider-related information in the future. In order to make a sense of that payload, a passive observer would need to -get hold of the original `peerID` that isn't revealed during the communication round. It's important to note that +* Using the original multihash, the client will decrypt `ProviderRecordKey`s and then use +the `peerID` to fetch a `ProviderRecord`. `ProviderRecord` will contain the provider's +*multiaddrs* with some other possible provider-related information in the future. 
`ProviderRecord`s are cacheable and hence this rountrip can be avoided most of the times; -* Using the key derived from the `peerID`, the client will decrypt `ProviderRecord`s and then reach out to the -provider directly to fetch the desired content; +* Using the `ProviderRecord` the client will reach out to the provider directly and fetch the desired content; * The client might choose to fetch additional `Metadata` that is supplied to IPNI in Advertisements. That will require another lookup round by `hash(ProviderRecordKey)` to get `enc(Metadata, ProviderRecordKey)` in response. @@ -122,9 +119,8 @@ sequenceDiagram indexer->>client: sends a list of [ProviderRecordKey], each encrypted with a key derived from MH loop ProviderRecordKeys client->>client: decrypts ProviderRecordKey and extracts peerID from it - client->>indexer: sends ProviderRecord lookup request for hash(peerID) - indexer->>client: sends a ProviderRecord encrypted with a key derived from peerID - client->>client: decrypts the ProviderRecord + client->>indexer: sends ProviderRecord lookup request for peerID + indexer->>client: sends a ProviderRecord client->>indexer: [Optional] sends Metadata lookup request for hash(ProviderRecordKey) indexer->>client: [Optional] sends Metadata encrypted with a key derived from ProviderRecordKey client->>client: [Optional] decrypts the Metadata From 51a2f02a0fa8cca746faa8ebdbf15554140e9d29 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 31 Jan 2023 13:45:36 +0000 Subject: [PATCH 43/49] Add info about libp2p peerstore --- reader-privacy.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 8ba1669..1d423cc 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -98,11 +98,12 @@ Each `ProviderRecordKey` will be encrypted with a key derived from the *original and `enc` is encryption over the value. *This notation is going to be used for the rest of the specification*. 
In order to make sense of that payload, a passive observer would need to get hold of the original multihash that isn't revealed during the communication round; -* Using the original multihash, the client will decrypt `ProviderRecordKey`s and then use +* Using the original multihash, the client will decrypt `ProviderRecordKey`s and use the `peerID` to fetch a `ProviderRecord`. `ProviderRecord` will contain the provider's *multiaddrs* with some other possible provider-related information in the future. -`ProviderRecord`s are cacheable and hence this rountrip can be avoided most of the times; -* Using the `ProviderRecord` the client will reach out to the provider directly and fetch the desired content; +`ProviderRecord`s are cacheable and hence this rountrip can be avoided most of the times. Peer addresses can +also be discovered through alternative sources such as libp2p peerstore ; +* Using addresses from the `ProviderRecord` the client will reach out to the provider directly and fetch the desired content; * The client might choose to fetch additional `Metadata` that is supplied to IPNI in Advertisements. That will require another lookup round by `hash(ProviderRecordKey)` to get `enc(Metadata, ProviderRecordKey)` in response. From 600b664de8e8eadcd4851a3c14ed4dc15f3df1f9 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Tue, 14 Feb 2023 10:18:26 +0000 Subject: [PATCH 44/49] Fix typo --- reader-privacy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reader-privacy.md b/reader-privacy.md index 1d423cc..288e257 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -63,7 +63,7 @@ which client during the routing process, as the potential adversary easily learn A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior. This is obviously undesirable and has been for some time now a strong request from the community. 
-The changes described in this specification introduce a IPNI Readres Privacy upgrade. It will prevent +The changes described in this specification introduce a IPNI Readers Privacy upgrade. It will prevent passive observers from tracking user's actions as described above. It will also be a first step towards fully private IPNI protocol that will eliminate indexers as centralised observers. From 891c9f51fae06c8eba66a520c429c5a5d50afe3c Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Fri, 3 Mar 2023 12:09:15 +0000 Subject: [PATCH 45/49] Align with the DHT specification --- reader-privacy.md | 178 +++++++++++++++++++++++----------------------- 1 file changed, 89 insertions(+), 89 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 288e257..2ae49b2 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -82,51 +82,81 @@ on the diagram below: ​ ![Index building flow](resources/readers-privacy-1.png) +## Magic Values + +All salts below are 64-bytes long, and represent a string padded with `\x00`. + +- `SALT_DOUBLEHASH = bytes("CR_DOUBLEHASH\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` +- `SALT_ENCRYPTIONKEY = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` +- `SALT_NONCE = bytes("CR_NONCE\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` + + +## Definitions + +- **Advertisement** is [IPNI Advertisement](https://github.com/ipni/storetheindex/blob/main/api/v0/ingest/schema/schema.ipldsch#L40). +- **Storage Provider** is a party who stores the data. 
+- **Publisher** is a party who publishes CIDs into IPNI on behalf of a Storage Provider. +- **Client** is a party who wants to find the content by its CID using IPNI. +- **Passive Observer** a rogue party that tries to understand what content is being looked up by observing Client to IPNI traffic. +- **`enc`** is [AESGCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) encryption. +- **`hash`** is [SHA256](https://en.wikipedia.org/wiki/SHA-2) hashing. +- **`||`** is concatenation of two values. +- **`deriveKey`** is a process for deriving 32-byte encryption key from a passphrase taht must be done as `hash(SALT_ENCRYPTIONKEY || passphrase)`. +- **`Nonce`** is a 12-byte Nonce used as Initialization Vector (IV) for the AES-GCM encryption. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). +Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. +That means that the nonce has to be deterministically chosen so that `enc(IV, passphrase, payload)` produces the same output for the same +`passpharase` + `payload` pair. Nonce must be calculated as `sha256(SALT_NONCE || passphrase || len(payload) || payload)[:12]`, where `len(payload)` is +an 8 byte length of the `payload` encoded in Little Endian format. Choice of nonce is not enforced by the IPNI specification. The described approach will +be used while IPNI encrypts Advertisements on behaf of Publishers. However once Writer Privacy is implemented, the choice of nonce will be left up to the Publisher. +- **`CID`** is the [Content IDentifier](https://github.com/multiformats/cid). +- **`MH`** is the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. It corresponds to the +digest of a hash function over some content. `MH` is represented as a 32-byte array. +- **`HASH2`** is a second hash over the multihash. 
Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. +The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. +- **`ProviderRecord`** is a data structure that contains such information about Storage Provider as ther PeerID and Addresses. +- **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for extra length / separators as they are +already encoded as a part of the `Multihash` format. +- **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. +- **`HashProviderRecordKey`** is a hash over `ProviderRecordKey` that must be calculated as `hash(SALT_DOUBLEHASH || ProviderRecordKey)`. +- **`Metadata`** is IPNI metadata that is supplied in Advertisements. +- **`EncMetadata`** is `Nonce || enc(deriveKey(ProviderRecordKey), Nonce, Metadata)`. + ## Specification ​ -This specification improves the reader privacy by proposing changes to the Step 3, depicted above, where the client +This specification improves Reader Privacy by proposing changes to the Step 3, depicted above, where the Client supplies the content CID directly in order to lookup its corresponding providers. -In order to protect the reader's privacy the proposal changes the way CID lookup works to the following: - -* A client who wants to do a lookup will calculate a hash over the CID's multihash (`hash(MH)`) and use it for the -lookup query (hence the name double hashing); -* In response to the hashed find request, the indexer will return a set of encrypted `ProviderRecordKey`s. -`ProviderRecordKey` will consist of the `peerID` concatenated with `contextID`. -Each `ProviderRecordKey` will be encrypted with a key derived from the *original* multihash value: -`enc(peerID || contextID, MH)`, where `hash` is a hash over the value, and `||` is concatenation -and `enc` is encryption over the value. *This notation is going to be used for the rest of the specification*. 
-In order to make sense of that payload, a passive observer would need -to get hold of the original multihash that isn't revealed during the communication round; -* Using the original multihash, the client will decrypt `ProviderRecordKey`s and use -the `peerID` to fetch a `ProviderRecord`. `ProviderRecord` will contain the provider's -*multiaddrs* with some other possible provider-related information in the future. -`ProviderRecord`s are cacheable and hence this rountrip can be avoided most of the times. Peer addresses can -also be discovered through alternative sources such as libp2p peerstore ; +* A Client who wants to do a lookup will calculate `HASH2` and use it for the lookup query; +* In response to that IPNI will return a list of `EncProviderRecordKey`s that contain +encrypted `peerID` and `contextID` of the Storage Providers that have the content represented by the original `MH`. +In order to make sense of that payload, a Passive Observer would need +to get hold of the original `MH` that isn't revealed during this communication round; +* Using the original `MH`, the Client will decrypt `EncProviderRecordKey`s and use +the `peerID` to fetch a `ProviderRecord`. `ProviderRecord`s can be cached on the Client side and hence this rountrip can be avoided most of the times. +Peer addresses can also be discovered through alternative sources such as libp2p peerstore ; * Using addresses from the `ProviderRecord` the client will reach out to the provider directly and fetch the desired content; -* The client might choose to fetch additional `Metadata` that is supplied to IPNI in Advertisements. -That will require another lookup round by `hash(ProviderRecordKey)` to get `enc(Metadata, ProviderRecordKey)` in response. +* The client might choose to fetch IPNI Metadata that will require another lookup round by `HashProviderRecordKey` to get `EncMetadata` in response. 
By utilising such scheme only a party that knows the original CID can decode the protocol, which is never revealed. ```mermaid sequenceDiagram - participant client - participant indexer - participant provider - client->>client: calculates hash(MH) - client->>indexer: sends a find request for hash(MH) - indexer->>client: sends a list of [ProviderRecordKey], each encrypted with a key derived from MH - loop ProviderRecordKeys - client->>client: decrypts ProviderRecordKey and extracts peerID from it - client->>indexer: sends ProviderRecord lookup request for peerID - indexer->>client: sends a ProviderRecord - client->>indexer: [Optional] sends Metadata lookup request for hash(ProviderRecordKey) - indexer->>client: [Optional] sends Metadata encrypted with a key derived from ProviderRecordKey - client->>client: [Optional] decrypts the Metadata + participant Client + participant IPNI + participant StorageProvider + Client->>Client: calculates HASH2 + Client->>IPNI: sends a find request for HASH2 + IPNI->>Client: sends a list of [EncProviderRecordKey] + loop EncProviderRecordKeys + Client->>Client: decrypts EncProviderRecordKey and extracts peerID from it + Client->>IPNI: sends ProviderRecord lookup request for peerID + IPNI->>Client: sends a ProviderRecord + Client->>IPNI: [Optional] sends EncMetadata lookup request for hash(ProviderRecordKey) + IPNI->>Client: [Optional] sends EncMetadata + Client->>Client: [Optional] decrypts the EncMetadata using ProviderRecordKey end - client->>provider: reaches out to the provider for the desired content + Client->>StorageProvider: reaches out to the Storage Provider for the desired content ``` ### Extended Providers @@ -136,15 +166,15 @@ or to a single Advertisement with a specific `ContextID`. That can be done by se If present Extended Providers are applied to the IPNI output on the server which results into more `ProviderRecord`s being returned to the user. 
Same will not be possible for privacy preserving lookups as the required fields such as `PeerID` and `ContextID` are opaque to the server. -While the mechanics stays the same, applying Extended Providers to the decrypted values will have to be done at the client side. If exist, Extended Providers should be included +While the mechanics stays the same, applying Extended Providers to the decrypted values will have to be done at the Client side. If exist, Extended Providers should be included as a field in the `ProviderRecord` which would make them cacheable too. ### Security ​ -Security model of the Reader Privacy proposal boils down to inability of an attacker to *algorithmically* derive the original multihash value from -`hash(multihash)` that is used for IPNI lookups. IPNI advertisments are not encrypted, but authenticated and contain plain multihash values in them. -Before Writer Privacy is implemented an attacker could build a map of `hash(multihash) -> multihash` -by re-ingesting Advertisements chain from each publisher in order to collect all original multihashes which can then be used to decrypt provider records and so on. +Security model of the Reader Privacy proposal boils down to inability of an attacker to *algorithmically* derive the original `MH` from +`HASH2` that is used for IPNI lookups. IPNI advertisments are not encrypted, but authenticated and contain plain multihash values in them. +Before Writer Privacy is implemented an attacker could build a map of `HASH2 -> MH` +by re-ingesting Advertisements chain from each Publisher in order to collect all original multihashes which can then be used to decrypt provider records and so on. Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade. Even with both Reader and Writer Privacies in place a rogue IPNI actor might abuse the double-hashing security model. 
For example: @@ -157,30 +187,28 @@ Rogue IPNI behaviour will be addressed by IPNI reputation system that is out of Reader Privacy is a first step towards fully private content routing protocol. -Wider security implications are discussed in the IPFS Reader Privacy specification: TODO link here. +Wider security implications are discussed in the [IPFS Reader Privacy specification](https://github.com/guillaumemichel/specs/blob/double-hashing-dht/IPIP/0373-double-hash-dht.md): TODO update the link once the PR is merged. #### Hashing and Encryption Function Upgrades -All multihashes have a codec encoded in them. If a hashing or encryption funciton will have to rotate then different types of multihahses can coexist together +All multihashes contain a codec. If a hashing or encryption funciton will have to rotate then different types of multihahses can coexist together and can be processed differently by IPNI implementations. It won't be possible to apply a fix retroactivelly to the data returned by previous lookup requests, however IPNI implementations can start blocking all new ones that use a compromised scheme, allowing some transtition period. Moving an IPNI implementation to a new hash / encryption function will require reingesting all data from a scratch. Before Writer Privacy is impemented the -index can be migrated over to new functions by reingesting existing advertisement chains. With Writer Privacy, Publishers will have to republish advertisments -using new functions (as the data in the advertisements themselves will have to be re-hashed / re-encrypted). Both old and new scheme can coexist together for some time. +index can be migrated over to new functions by reingesting existing Advertisement chains. With Writer Privacy, Publishers will have to republish Advertisments +using new functions (as the data in the Advertisements themselves will have to be re-hashed / re-encrypted). Both old and new scheme can coexist together for some time. 
The old scheme should be retired either immediately or once the indexes have been rebuilt and the users have been migrated over. An exact operational procedure will be different for differnet IPNI implementations. ### Trade Offs -* **Multiple lookups**. In the simplest scenario Reader Privacy protocol will require at least two roundtrips to find provider details for a given CID. -It can be reduced down to one by caching `ProviderRecord`s at the client side, which would eliminate a need in a lookup -per decrypted `ProviderRecordKey`. In the future there can be a separate service that distributes `PeerID` to `Multiaddresses` mappings in open. -That dataset can be periodically downloaded by all clients and cached locally; +* **Multiple lookups**. In the best case scenario Reader Privacy protocol will require one roundtrip to get a list of peers for a given CID. +Worst case scenario, when both `ProviderRecord`s and `Metadata`s need to be fetched from IPNI, will require 3+ lookups. -* **Extra compute**. At minimum, clients will require to perform an extra hashing per CID and decryption per `ProviderRecordKey` that will add -some overhead to each lookup. That overhead can be initially offloaded to the server but will have to be done by clients eventually (explained below); +* **Extra compute**. At minimum, Clients will require to perform an extra hashing per CID and decryption per `ProviderRecordKey` that will add +some overhead to each lookup. * **Extra storage space**. Storing encrypted data will require more space due to paddings and nonce; @@ -196,54 +224,26 @@ IPNI implementations will have to serve both plain and hashed lookups. That will ### Threat Modelling -There are three actors involved into the IPNI workflow: provider, client and indexer. Providers update index by publishing advertisements. -Indexer advertisements are signed by their publishers and can be authenticated. Advertisements are organised in a chain and are ingested strictly in order. 
-It's not possible to reorder advertisements wihtout forking the chain. Advertisements processing is idempotent - re-ingesting the same advertismeent twice -shouldn't affect the indexer's state. The IPNI specification is agnostic to transport protocols so particular protocol choice is up to the implementation. -Compromised publisher's identity is out of scope of this specification. +There are three actors involved into the IPNI workflow: Publisher, Client and IPNI. Publishers update index by publishing Advertisements. +Advertisements are signed by their Publishers and can be authenticated. Advertisements are organised in a chain and are ingested strictly in order. +It's not possible to reorder Advertisements wihtout having to fork the chain. Advertisements processing is idempotent - re-ingesting the same Advertismeent twice +doesn't affect IPNI state. The IPNI specification is agnostic to transport protocols so particular protocol choice is up to the implementation. +Compromised Publisher's identity is out of scope of this specification. -Clients consume index by performing CID lookups. This specification introduces additional hashing and encryption that aim to prevent a passive observer -from being able to infer what data is being looked up by spying at the client to indexer traffic. The exact communicaton protocol is out of scope for this specification -however it should be chosen carefully to prevent MITM attacks. Before Writer Privacy upgrade a passive observer could deobfuscate the content by building a `hash(Multihash)` to `Multihash` map. +Clients consume index by performing CID lookups. This specification introduces additional hashing and encryption that aim to prevent a Passive Observer +from being able to infer what data is being looked up by spying at the Client to IPNI traffic. The exact communicaton protocol is out of scope for this specification +however it should be chosen carefully to prevent MITM attacks. 
Before Writer Privacy upgrade a Passive Observer could deobfuscate the content by building a `HASH2` to `MH` map. That however will not be possible eventually as mentioned in the [Security](#security) section. -Provider records returned to client lookups do not contain any authentication data. It's possible for a malicious / buggy indexer to -present a wrong dataset to the client. Clients can tackle that problem by excluding such indexers from their pool. Returning wrong datasets will -eventually affect the indexer's reputation score - these efforts are already in progress. Data integrity is inbuilt into IPFS - clients -can verify that the data returned by a storage provider matches the CID. So even if indexer get compromised that will not compromise +`EncProviderRecord`s do not contain any authentication data. It's possible for a malicious IPNI to +present a wrong dataset to the Client. Clients can tackle that by excluding such IPNIs from their pool. Returning wrong datasets will +eventually affect the IPNI's reputation score - these efforts are already in progress. Data integrity is inbuilt into IPFS - Clients +can verify that the data returned by a Storage Provider matches the CID. So even if an IPNI get compromised that will not compromise the data itself. -## Implementation - -This section goes through Reader Privacy protocol, hashing and cryptography choices as well as explains some less obvious details -for the implementations to consider. - -### Hashing - -SHA256 must be used as a hashing function. To avoid collisions between the hashes calculated from the same value but for different purposes (for example -to derive an encryption key versus derive a lookup key) a constant string can be appended to the value before calculating a hash over it. For example -`sha256("AESGCM" + multihash)` to derive an encryption key and `sha256("CR_DOUBLEHASH" + multihash)` to calculate a second hash over the multihash. 
- -All hashed data that is used for lookups must be of `Multihash` format with `SHA_256` codec. Double hashed data must use `DBL_SHA_256` codec. - -Multihashes must be prepended with `CR_DOUBLEHASH` before calculating a second hash. - -### Encryption - -AES in GCM mode must be used for encryption. AES keys must be 32-bytes long and be derived from the passpharse by calculating a SHA256 over it. AES keys must be prepended with `AESGCM` before hashing them over. -One particular detail - is a careful choice of 12-byte IV. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). -Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. -That means that the IV needs to be deterministically chosen so that `enc(IV, passphrase, payload)` produces the same output for the same -passpharase + payload pair. One strategy could be to deterministically derive an IV from the passphrase or to generate it randomly and store -the mapping on the client side. The IPNI specification doesn't enforce how an IV should be chosen and leaves that up to the client to decide. - -Encrypted payload must have nonce encoded in it: `nonce || enc(payload)`, where `nonce` is 12 bytes long. - ### Data Formats -* All binary data must be b58 encoded when transferred over the wire. -* `ProviderRecordKey`s must be created by concatenating binary `PeerID` with binary `ContextID`. There is no need for extra length / separators as they are -already encoded as a part of the `Multihash` format. +All binary data must be b58 encoded when transferred over the wire. 
## Related Resources ​ From 453f531211b65ac87e0c6c4953ac61dd07fad026 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Fri, 3 Mar 2023 12:13:37 +0000 Subject: [PATCH 46/49] Spacing --- reader-privacy.md | 1 - 1 file changed, 1 deletion(-) diff --git a/reader-privacy.md b/reader-privacy.md index 2ae49b2..01af0c9 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -90,7 +90,6 @@ All salts below are 64-bytes long, and represent a string padded with `\x00`. - `SALT_ENCRYPTIONKEY = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` - `SALT_NONCE = bytes("CR_NONCE\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` - ## Definitions - **Advertisement** is [IPNI Advertisement](https://github.com/ipni/storetheindex/blob/main/api/v0/ingest/schema/schema.ipldsch#L40). From 033307c37effdd1275644f29c851f85df6b2814e Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Fri, 3 Mar 2023 14:58:23 +0000 Subject: [PATCH 47/49] Fix typos --- reader-privacy.md | 57 ++++++++++++++++++++++++----------------------- 1 file changed, 29 insertions(+), 28 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 01af0c9..96f3f1b 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -75,9 +75,9 @@ fully private IPNI protocol that will eliminate indexers as centralised observer ​ ## Background ​ -Network indexers build their indexes by ingesting chains of Advertisements. Advertisement is a +IPNI builds its indexes by ingesting chains of Advertisements. Advertisement is a construct that allows Storage Providers to publish their CIDs in bulk (FIL deals) instead of doing -that individually for each CID. 
A group of CIDs is represented by a unique ContextID as can be seen +that individually for each CID. A group of CIDs is represented by a ContextID that is unique per provider as can be seen on the diagram below: ​ ![Index building flow](resources/readers-privacy-1.png) @@ -90,22 +90,23 @@ All salts below are 64-bytes long, and represent a string padded with `\x00`. - `SALT_ENCRYPTIONKEY = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` - `SALT_NONCE = bytes("CR_NONCE\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` -## Definitions + +## Definitions - **Advertisement** is [IPNI Advertisement](https://github.com/ipni/storetheindex/blob/main/api/v0/ingest/schema/schema.ipldsch#L40). -- **Storage Provider** is a party who stores the data. +- **Storage Provider** is a party who stores the data and wants that data to be discoverable through IPNI. - **Publisher** is a party who publishes CIDs into IPNI on behalf of a Storage Provider. -- **Client** is a party who wants to find the content by its CID using IPNI. -- **Passive Observer** a rogue party that tries to understand what content is being looked up by observing Client to IPNI traffic. -- **`enc`** is [AESGCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) encryption. +- **Client** is a party who wants to find the content by its CID using IPNI for the purpose of retreiving from the Storage Provider. +- **Passive Observer** is a rogue party that wants to understand what content is being looked up by observing Client-to-IPNI traffic. +- **`enc`** is [AESGCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) encryption. 
The following notation will be used for the rest of the specification `enc(passphrase, nonce, payload)`. - **`hash`** is [SHA256](https://en.wikipedia.org/wiki/SHA-2) hashing. - **`||`** is concatenation of two values. -- **`deriveKey`** is a process for deriving 32-byte encryption key from a passphrase taht must be done as `hash(SALT_ENCRYPTIONKEY || passphrase)`. -- **`Nonce`** is a 12-byte Nonce used as Initialization Vector (IV) for the AES-GCM encryption. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). +- **`deriveKey`** is deriving a 32-byte encryption key from a passphrase that is done as `hash(SALT_ENCRYPTIONKEY || passphrase)`. +- **`Nonce`** is a 12-byte nonce used as Initialization Vector (IV) for the AESGCM encryption. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. -That means that the nonce has to be deterministically chosen so that `enc(IV, passphrase, payload)` produces the same output for the same -`passpharase` + `payload` pair. Nonce must be calculated as `sha256(SALT_NONCE || passphrase || len(payload) || payload)[:12]`, where `len(payload)` is -an 8 byte length of the `payload` encoded in Little Endian format. Choice of nonce is not enforced by the IPNI specification. The described approach will +That means that the nonce has to be deterministically chosen so that `enc(passphrase, nonce, payload)` produces the same output for the same +`passpharase` + `payload` pair. Nonce must be calculated as `hash(SALT_NONCE || passphrase || len(payload) || payload)[:12]`, where `len(payload)` is +an 8-byte length of the `payload` encoded in Little Endian format. Choice of nonce is not enforced by the IPNI specification. 
The described approach will be used while IPNI encrypts Advertisements on behaf of Publishers. However once Writer Privacy is implemented, the choice of nonce will be left up to the Publisher. - **`CID`** is the [Content IDentifier](https://github.com/multiformats/cid). - **`MH`** is the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. It corresponds to the @@ -113,8 +114,8 @@ digest of a hash function over some content. `MH` is represented as a 32-byte ar - **`HASH2`** is a second hash over the multihash. Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. - **`ProviderRecord`** is a data structure that contains such information about Storage Provider as ther PeerID and Addresses. -- **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for extra length / separators as they are -already encoded as a part of the `Multihash` format. +- **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for explicitly encoding lengths as they are +already encoded as a part of the multihash format. - **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. - **`HashProviderRecordKey`** is a hash over `ProviderRecordKey` that must be calculated as `hash(SALT_DOUBLEHASH || ProviderRecordKey)`. - **`Metadata`** is IPNI metadata that is supplied in Advertisements. @@ -123,7 +124,7 @@ already encoded as a part of the `Multihash` format. ## Specification ​ This specification improves Reader Privacy by proposing changes to the Step 3, depicted above, where the Client -supplies the content CID directly in order to lookup its corresponding providers. +supplies the CID to IPNI in order to lookup corresponding Storage Providers. 
* A Client who wants to do a lookup will calculate `HASH2` and use it for the lookup query; * In response to that IPNI will return a list of `EncProviderRecordKey`s that contain @@ -133,8 +134,8 @@ to get hold of the original `MH` that isn't revealed during this communication r * Using the original `MH`, the Client will decrypt `EncProviderRecordKey`s and use the `peerID` to fetch a `ProviderRecord`. `ProviderRecord`s can be cached on the Client side and hence this rountrip can be avoided most of the times. Peer addresses can also be discovered through alternative sources such as libp2p peerstore ; -* Using addresses from the `ProviderRecord` the client will reach out to the provider directly and fetch the desired content; -* The client might choose to fetch IPNI Metadata that will require another lookup round by `HashProviderRecordKey` to get `EncMetadata` in response. +* Using addresses from the `ProviderRecord` the Client will reach out to the Storage Provider directly and fetch the desired content; +* The Client might choose to fetch IPNI Metadata that will require another lookup round by `HashProviderRecordKey` to get `EncMetadata` in response. By utilising such scheme only a party that knows the original CID can decode the protocol, which is never revealed. 
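The hashing steps behind the lookup flow above (`HASH2`, `deriveKey` and the deterministic `Nonce`) can be sketched as follows. This is a minimal illustration rather than a reference implementation, and it rests on several assumptions: the function names are hypothetical, the `CR_DOUBLEHASH` label for `SALT_DOUBLEHASH` is taken from an earlier revision of this text, and `MH` is treated as a raw 32-byte value. Wrapping the digest into a `Multihash` with the `DBL_SHA_256` codec and the AESGCM step itself are left out.

```python
import hashlib
import struct

SALT_LEN = 64  # all salts are 64 bytes: an ASCII label padded with \x00


def make_salt(label: str) -> bytes:
    """Pad an ASCII label with NUL bytes up to the 64-byte salt length."""
    return label.encode("ascii").ljust(SALT_LEN, b"\x00")


# Assumption: the label for SALT_DOUBLEHASH is "CR_DOUBLEHASH".
SALT_DOUBLEHASH = make_salt("CR_DOUBLEHASH")
SALT_ENCRYPTIONKEY = make_salt("CR_ENCRYPTIONKEY")
SALT_NONCE = make_salt("CR_NONCE")


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def hash2_digest(mh: bytes) -> bytes:
    """Digest used for lookups: hash(SALT_DOUBLEHASH || MH)."""
    return sha256(SALT_DOUBLEHASH + mh)


def derive_key(passphrase: bytes) -> bytes:
    """32-byte encryption key: hash(SALT_ENCRYPTIONKEY || passphrase)."""
    return sha256(SALT_ENCRYPTIONKEY + passphrase)


def derive_nonce(passphrase: bytes, payload: bytes) -> bytes:
    """12-byte nonce: hash(SALT_NONCE || passphrase || len(payload) || payload)[:12].

    len(payload) is encoded as an 8-byte little-endian integer.
    """
    length = struct.pack("<Q", len(payload))
    return sha256(SALT_NONCE + passphrase + length + payload)[:12]


# Stand-in values for illustration only; a real MH comes from a CID and a
# real ProviderRecordKey is peerID || contextID.
mh = hashlib.sha256(b"example content").digest()
provider_record_key = b"example-peer-id" + b"example-context-id"

lookup_digest = hash2_digest(mh)               # sent to IPNI (after multihash wrapping)
key = derive_key(mh)                           # never leaves the Client
nonce = derive_nonce(mh, provider_record_key)  # same inputs give the same nonce
```

Because the nonce depends only on the passphrase and the payload, encrypting the same `ProviderRecordKey` under the same `MH` produces the same ciphertext, which is what lets IPNI compare encrypted values without decrypting them.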
@@ -149,8 +150,8 @@ sequenceDiagram IPNI->>Client: sends a list of [EncProviderRecordKey] loop EncProviderRecordKeys Client->>Client: decrypts EncProviderRecordKey and extracts peerID from it - Client->>IPNI: sends ProviderRecord lookup request for peerID - IPNI->>Client: sends a ProviderRecord + Client->>IPNI: [Optional] sends ProviderRecord lookup request for peerID + IPNI->>Client: [Optional] sends a ProviderRecord Client->>IPNI: [Optional] sends EncMetadata lookup request for hash(ProviderRecordKey) IPNI->>Client: [Optional] sends EncMetadata Client->>Client: [Optional] decrypts the EncMetadata using ProviderRecordKey @@ -160,7 +161,7 @@ sequenceDiagram ### Extended Providers -[Extended Providers](https://github.com/ipni/specs/blob/main/IPNI.md#extendedprovider) allow a publisher to add an extra information to all their past and future Advertisements +[Extended Providers](https://github.com/ipni/specs/blob/main/IPNI.md#extendedprovider) allow a Publisher to add an extra information to all their past and future Advertisements or to a single Advertisement with a specific `ContextID`. That can be done by sending just a single Advertisement without having to re-publish the whole Advertisement chain. If present Extended Providers are applied to the IPNI output on the server which results into more `ProviderRecord`s being returned to the user. Same will not be possible for privacy preserving lookups as the required fields such as `PeerID` and `ContextID` are opaque to the server. @@ -170,13 +171,13 @@ as a field in the `ProviderRecord` which would make them cacheable too. ### Security ​ -Security model of the Reader Privacy proposal boils down to inability of an attacker to *algorithmically* derive the original `MH` from -`HASH2` that is used for IPNI lookups. IPNI advertisments are not encrypted, but authenticated and contain plain multihash values in them. 
-Before Writer Privacy is implemented an attacker could build a map of `HASH2 -> MH` -by re-ingesting Advertisements chain from each Publisher in order to collect all original multihashes which can then be used to decrypt provider records and so on. -Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by *Writer Privacy* upgrade. +Security model of the Reader Privacy proposal boils down to inability of a Passive Observer to *algorithmically* derive the original `MH` from +`HASH2` that is used for IPNI lookups. IPNI Advertisments are not encrypted, but authenticated and contain plain multihash values in them. +Before Writer Privacy is implemented a Passive Observer could build a map of `HASH2 -> MH` +by re-ingesting Advertisements chain from each Publisher in order to collect all original multihashes which can then be used to decrypt `EncProviderRecord`s and so on. +Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by Writer Privacy upgrade. -Even with both Reader and Writer Privacies in place a rogue IPNI actor might abuse the double-hashing security model. For example: +Even with both Reader and Writer Privacies in place a rogue IPNI actor might abuse this security model. For example: * Someone wants to detect who is looking for a particular piece of content, i.e. surveilling content. For example, an IPNI endpoint that wants to know how frequently people are requesting some website it cares about; * Someone wants to do mass surveillance on readily accessible data. For example, a group running an IPNI endpoint also runs web crawlers looking for IPFS links, @@ -223,14 +224,14 @@ IPNI implementations will have to serve both plain and hashed lookups. That will ### Threat Modelling -There are three actors involved into the IPNI workflow: Publisher, Client and IPNI. 
Publishers update index by publishing Advertisements. +There are three actors involved into the IPNI workflow: Publisher, Client and IPNI. Publishers makes update to indexes by publishing Advertisements. Advertisements are signed by their Publishers and can be authenticated. Advertisements are organised in a chain and are ingested strictly in order. It's not possible to reorder Advertisements wihtout having to fork the chain. Advertisements processing is idempotent - re-ingesting the same Advertismeent twice doesn't affect IPNI state. The IPNI specification is agnostic to transport protocols so particular protocol choice is up to the implementation. Compromised Publisher's identity is out of scope of this specification. Clients consume index by performing CID lookups. This specification introduces additional hashing and encryption that aim to prevent a Passive Observer -from being able to infer what data is being looked up by spying at the Client to IPNI traffic. The exact communicaton protocol is out of scope for this specification +from being able to infer what data is being looked up by spying at the Client-to-IPNI traffic. The exact communicaton protocol is out of scope for this specification however it should be chosen carefully to prevent MITM attacks. Before Writer Privacy upgrade a Passive Observer could deobfuscate the content by building a `HASH2` to `MH` map. That however will not be possible eventually as mentioned in the [Security](#security) section. From 896ff3d5987d8b4b20d934bf20ecc93554da3b85 Mon Sep 17 00:00:00 2001 From: gammazero Date: Tue, 19 Mar 2024 00:50:18 -0700 Subject: [PATCH 48/49] Update text, links, etc. --- reader-privacy.md | 94 ++++++++++++++++++++--------------------------- 1 file changed, 40 insertions(+), 54 deletions(-) diff --git a/reader-privacy.md b/reader-privacy.md index 96f3f1b..5a48cba 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -30,10 +30,9 @@ reader privacy is amplified. 
This makes IPNI a difficult choice as an alternativ projects such as IPFS, which use a more decentrailsed routing system that by nature reduces the possibility of mass query snooping. ​ -There is ongoing work on IPFS side to integrate a reader privacy technique, a.k.a, double hashing. -Building on top of the existing approach, this document specifies how a similar technique is applied -to IPNI in order to preserve the reader's privacy while continuing to facilitate low-latency -provider lookup. +Building on top of IPFS's reader privacy mechanism, a.k.a, double-hashing, this document specifies how +a similar technique is applied to IPNI in order to preserve the reader's privacy while continuing to +facilitate low-latency provider lookup. ​ ## Table of Contents @@ -55,17 +54,16 @@ For more technical implementation details please see the [Addendum](reader-priva ​ ## Introduction ​ -IPFS is currently lacking of many privacy protections. One of its main weak points lies in the lack -of privacy protections for the content routing subsystem. Currently neither readers (clients accessing files) -nor writers (hosts storing and distributing content) have much privacy with regard to content they publish or -consume. It is very easy for a content router node or a passive observer to learn which file is requested by -which client during the routing process, as the potential adversary easily learns about the requested `CID`. -A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior. -This is obviously undesirable and has been for some time now a strong request from the community. - -The changes described in this specification introduce a IPNI Readers Privacy upgrade. It will prevent -passive observers from tracking user's actions as described above. It will also be a first step towards -fully private IPNI protocol that will eliminate indexers as centralised observers. 
+IPFS requires various privacy protections, which includes privacy for the content routing subsystem. Readers +(clients accessing files) need privacy for the content they consume and writers (hosts storing and +distributing content) need privacy for the content they publish. It is very easy for a content router node or +a passive observer to learn which file is requested by which client during the routing process, as the +potential adversary easily learns about the requested `CID`. A snooping actor could request the same `CID` +and download the associated file to monitor the user’s behavior. This is obviously undesirable and has been +for some time now a strong request from the community. + +This specification describes how IPNI provides Readers Privacy. It prevents passive observers from tracking a user's +actions as described above. ### Non Goals @@ -93,11 +91,11 @@ All salts below are 64-bytes long, and represent a string padded with `\x00`. ## Definitions -- **Advertisement** is [IPNI Advertisement](https://github.com/ipni/storetheindex/blob/main/api/v0/ingest/schema/schema.ipldsch#L40). +- **Advertisement** is [IPNI Advertisement](https://github.com/ipni/go-libipni/blob/main/ingest/schema/schema.ipldsch#L40). - **Storage Provider** is a party who stores the data and wants that data to be discoverable through IPNI. -- **Publisher** is a party who publishes CIDs into IPNI on behalf of a Storage Provider. +- **Publisher** is a party who publishes multihashes, via Advertisements, to IPNI on behalf of a Storage Provider. - **Client** is a party who wants to find the content by its CID using IPNI for the purpose of retreiving from the Storage Provider. -- **Passive Observer** is a rogue party that wants to understand what content is being looked up by observing Client-to-IPNI traffic. +- **Passive Observer** is a snooping party that wants to understand what content is being looked up by observing Client-to-IPNI traffic. 
- **`enc`** is [AESGCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) encryption. The following notation will be used for the rest of the specification `enc(passphrase, nonce, payload)`. - **`hash`** is [SHA256](https://en.wikipedia.org/wiki/SHA-2) hashing. - **`||`** is concatenation of two values. @@ -106,16 +104,12 @@ All salts below are 64-bytes long, and represent a string padded with `\x00`. Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. That means that the nonce has to be deterministically chosen so that `enc(passphrase, nonce, payload)` produces the same output for the same `passpharase` + `payload` pair. Nonce must be calculated as `hash(SALT_NONCE || passphrase || len(payload) || payload)[:12]`, where `len(payload)` is -an 8-byte length of the `payload` encoded in Little Endian format. Choice of nonce is not enforced by the IPNI specification. The described approach will -be used while IPNI encrypts Advertisements on behaf of Publishers. However once Writer Privacy is implemented, the choice of nonce will be left up to the Publisher. +an 8-byte length of the `payload` encoded in Little Endian format. Choice of nonce is not enforced by the IPNI specification. The described approach will be used while IPNI encrypts Advertisements on behaf of Publishers. However once Writer Privacy is implemented, the choice of nonce will be left up to the Publisher. - **`CID`** is the [Content IDentifier](https://github.com/multiformats/cid). -- **`MH`** is the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. It corresponds to the -digest of a hash function over some content. `MH` is represented as a 32-byte array. -- **`HASH2`** is a second hash over the multihash. Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. -The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. 
+- **`MH`** is the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. It corresponds to the digest of a hash function over some content. `MH` is represented as a 32-byte array. +- **`HASH2`** is a second hash over the multihash. Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. - **`ProviderRecord`** is a data structure that contains such information about Storage Provider as ther PeerID and Addresses. -- **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for explicitly encoding lengths as they are -already encoded as a part of the multihash format. +- **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for explicitly encoding lengths as they are already encoded as a part of the multihash format. - **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. - **`HashProviderRecordKey`** is a hash over `ProviderRecordKey` that must be calculated as `hash(SALT_DOUBLEHASH || ProviderRecordKey)`. - **`Metadata`** is IPNI metadata that is supplied in Advertisements. @@ -123,17 +117,15 @@ already encoded as a part of the multihash format. ## Specification ​ -This specification improves Reader Privacy by proposing changes to the Step 3, depicted above, where the Client -supplies the CID to IPNI in order to lookup corresponding Storage Providers. +This specification improves Reader Privacy by making changes to the Step 3, depicted above, where the Client +supplies the multihash (or CID) to IPNI in order to lookup corresponding Storage Providers. * A Client who wants to do a lookup will calculate `HASH2` and use it for the lookup query; -* In response to that IPNI will return a list of `EncProviderRecordKey`s that contain -encrypted `peerID` and `contextID` of the Storage Providers that have the content represented by the original `MH`. 
+* In response to that IPNI will return a list of `EncProviderRecordKey`s that contain encrypted `peerID` and `contextID` of the Storage Providers that have the content represented by the original `MH`. In order to make sense of that payload, a Passive Observer would need to get hold of the original `MH` that isn't revealed during this communication round; -* Using the original `MH`, the Client will decrypt `EncProviderRecordKey`s and use -the `peerID` to fetch a `ProviderRecord`. `ProviderRecord`s can be cached on the Client side and hence this rountrip can be avoided most of the times. -Peer addresses can also be discovered through alternative sources such as libp2p peerstore ; +* Using the original `MH`, the Client decrypts `EncProviderRecordKey`s and uses the `peerID` to fetch a `ProviderRecord`. `ProviderRecord`s can be cached on the Client side and hence this roundtrip can be avoided most of the time. +Peer addresses can also be discovered through alternative sources such as libp2p peerstore; * Using addresses from the `ProviderRecord` the Client will reach out to the Storage Provider directly and fetch the desired content; * The Client might choose to fetch IPNI Metadata that will require another lookup round by `HashProviderRecordKey` to get `EncMetadata` in response. @@ -175,7 +167,7 @@ Security model of the Reader Privacy proposal boils down to inability of a Passi `HASH2` that is used for IPNI lookups. IPNI Advertisements are not encrypted, but authenticated and contain plain multihash values in them. Before Writer Privacy is implemented a Passive Observer could build a map of `HASH2 -> MH` by re-ingesting Advertisements chain from each Publisher in order to collect all original multihashes which can then be used to decrypt `EncProviderRecord`s and so on. -Doing that will require significant resources as it involves crawling the entire network. However, it will eventually be eliminated by Writer Privacy upgrade.
+Doing that will require significant resources as it involves crawling the entire network. Eliminating this requires a Writer Privacy upgrade. Even with both Reader and Writer Privacies in place a rogue IPNI actor might abuse this security model. For example: * Someone wants to detect who is looking for a particular piece of content, i.e. surveilling content. For example, an IPNI endpoint that wants to know how @@ -183,7 +175,7 @@ frequently people are requesting some website it cares about; * Someone wants to do mass surveillance on readily accessible data. For example, a group running an IPNI endpoint also runs web crawlers looking for IPFS links, or runs a public HTTP gateway and can log those requests. -Rogue IPNI behaviour will be addressed by IPNI reputation system that is out of scope for this specification. +Rogue IPNI behaviour will be addressed by IPNI writer privacy and reputation system that is out of scope for this specification. Reader Privacy is a first step towards fully private content routing protocol. @@ -192,10 +184,10 @@ Wider security implications are discussed in the [IPFS Reader Privacy specificat #### Hashing and Encryption Function Upgrades All multihashes contain a codec. If a hashing or encryption funciton will have to rotate then different types of multihahses can coexist together -and can be processed differently by IPNI implementations. It won't be possible to apply a fix retroactivelly to the data returned by previous lookup requests, +and can be processed differently by IPNI implementations. It will not be possible to apply a fix retroactivelly to the data returned by previous lookup requests, however IPNI implementations can start blocking all new ones that use a compromised scheme, allowing some transtition period. -Moving an IPNI implementation to a new hash / encryption function will require reingesting all data from a scratch. 
Before Writer Privacy is impemented the +Moving an IPNI implementation to a new hash / encryption function will require reingesting all data from the beginning of its publication. Before Writer Privacy is impemented the index can be migrated over to new functions by reingesting existing Advertisement chains. With Writer Privacy, Publishers will have to republish Advertisments using new functions (as the data in the Advertisements themselves will have to be re-hashed / re-encrypted). Both old and new scheme can coexist together for some time. The old scheme should be retired either immediately or once the indexes have been rebuilt and the users have been migrated over. @@ -207,20 +199,17 @@ An exact operational procedure will be different for differnet IPNI implementati * **Multiple lookups**. In the best case scenario Reader Privacy protocol will require one roundtrip to get a list of peers for a given CID. Worst case scenario, when both `ProviderRecord`s and `Metadata`s need to be fetched from IPNI, will require 3+ lookups. -* **Extra compute**. At minimum, Clients will require to perform an extra hashing per CID and decryption per `ProviderRecordKey` that will add +* **Extra compute**. At minimum, Clients must perform an extra hash computation per CID and decryption per `ProviderRecordKey` that will add some overhead to each lookup. -* **Extra storage space**. Storing encrypted data will require more space due to paddings and nonce; +* **Extra storage space**. Storing encrypted data will require more space due to padding and nonce; -* **Bulk deletes**. Encrypted `PeerID` will be different for each multihash and hence bulk delete operations (delete everything for a provider X) will not be possible. +* **Bulk deletes**. Encrypted `PeerID` will be different for each multihash and hence bulk delete operations (delete everything for a provider X) will not be possible. 
Such deletion will require a garbage collection mechanism that rereads deleted advertisements and deletes the HASH2 for all multihashes. * **Operational overhead**. Reader Privacy roll out will be a gradual process as many clients will have to migrate over. During the transition period -IPNI implementations will have to serve both plain and hashed lookups. That will involve either: - - spinning up separate IPNI instances for hashed queries or; - - serving hashed and regular queries from the same instances using encrypted dataset. That means that servers will have to do decryption on behalf of their clients - using a plain multihash that has been provided in the lookup request; +IPNI implementations will have to serve both plain and hashed lookups. That will involve serving hashed and regular queries from the same instances. Servers will have to do decryption on behalf of their clients using a plain multihash that has been provided in the lookup request; -* **Data Migration**. Existing indexes will have to undergo data migration or to be wiped out complletely and rebuilt again. +* **Data Migration**. Existing indexes will have to re-ingest all index content that they want to provide Reader Privacy for. ### Threat Modelling @@ -230,16 +219,13 @@ It's not possible to reorder Advertisements wihtout having to fork the chain. Ad doesn't affect IPNI state. The IPNI specification is agnostic to transport protocols so particular protocol choice is up to the implementation. Compromised Publisher's identity is out of scope of this specification. -Clients consume index by performing CID lookups. This specification introduces additional hashing and encryption that aim to prevent a Passive Observer -from being able to infer what data is being looked up by spying at the Client-to-IPNI traffic. The exact communicaton protocol is out of scope for this specification -however it should be chosen carefully to prevent MITM attacks.
Before Writer Privacy upgrade a Passive Observer could deobfuscate the content by building a `HASH2` to `MH` map. -That however will not be possible eventually as mentioned in the [Security](#security) section. +Clients consume index by performing CID (multihash) lookups. Additional hashing and encryption aims to prevent a Passive Observer +from being able to infer what data is being looked up by spying at the Client-to-IPNI traffic. Withouty a writer privacy solution, a malicious indexer that keeps a map of HASH2->MH could expose a client. Therefore communicaton protocol between client and IPNI should be chosen carefully to prevent MITM attacks. -`EncProviderRecord`s do not contain any authentication data. It's possible for a malicious IPNI to +`EncProviderRecord`s do not contain any authentication data. It is possible for a malicious IPNI to present a wrong dataset to the Client. Clients can tackle that by excluding such IPNIs from their pool. Returning wrong datasets will -eventually affect the IPNI's reputation score - these efforts are already in progress. Data integrity is inbuilt into IPFS - Clients -can verify that the data returned by a Storage Provider matches the CID. So even if an IPNI get compromised that will not compromise -the data itself. +eventually affect the IPNI's reputation score when a reputation tracking system is available. Data integrity is built into IPFS - Clients +can verify that the data returned by a Storage Provider matches the CID. So even if an IPNI is compromised the data itself is not compromised. ### Data Formats @@ -255,4 +241,4 @@ TODO: link to corresponding IPFS spec once materialised. ​ ## Copyright ​ -Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). \ No newline at end of file +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). 
From 7090ce06789ff833cc08710bed13242852bf4772 Mon Sep 17 00:00:00 2001 From: gammazero Date: Tue, 19 Mar 2024 00:55:52 -0700 Subject: [PATCH 49/49] Update storage space overhead --- reader-privacy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reader-privacy.md b/reader-privacy.md index 5a48cba..9984982 100644 --- a/reader-privacy.md +++ b/reader-privacy.md @@ -202,7 +202,7 @@ Worst case scenario, when both `ProviderRecord`s and `Metadata`s need to be fetc * **Extra compute**. At minimum, Clients must perform an extra hash computation per CID and decryption per `ProviderRecordKey` that will add some overhead to each lookup. -* **Extra storage space**. Storing encrypted data will require more space due to padding and nonce; +* **Extra storage space**. Storing encrypted data will require more space due to padding, nonce, and the addition of an encrypted provider key per multihash; * **Bulk deletes**. Encrypted `PeerID` will be different for each multihash and hence bulk delete operations (delete everything for a provider X) will not be possible. Such deletion will require a garbage collection mechanism that rereads deleted advertisements and deletes the HASH2 for all multihashes.
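
For readers of this series, the deterministic nonce and the second-hash digest defined in the notation section of the patched document can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the IPNI implementation: the salt label strings (`EXAMPLE_DOUBLEHASH`, `EXAMPLE_NONCE`) are placeholders, the helper names `pad_salt`, `second_hash_digest`, and `derive_nonce` are hypothetical, and wrapping the digest in a Multihash with the `DBL_SHA_256` codec is omitted.

```python
import hashlib
import struct

SALT_LEN = 64  # the spec states that all salts are 64 bytes, padded with \x00


def pad_salt(label: bytes) -> bytes:
    """Right-pad a salt label with zero bytes to 64 bytes (hypothetical helper)."""
    if len(label) > SALT_LEN:
        raise ValueError("salt label is longer than 64 bytes")
    return label.ljust(SALT_LEN, b"\x00")


# Placeholder labels for illustration only; the real salt strings are
# defined by the specification and are not reproduced here.
SALT_DOUBLEHASH = pad_salt(b"EXAMPLE_DOUBLEHASH")
SALT_NONCE = pad_salt(b"EXAMPLE_NONCE")


def second_hash_digest(mh: bytes) -> bytes:
    """Digest of HASH2: hash(SALT_DOUBLEHASH || MH), with hash = SHA256."""
    return hashlib.sha256(SALT_DOUBLEHASH + mh).digest()


def derive_nonce(passphrase: bytes, payload: bytes) -> bytes:
    """hash(SALT_NONCE || passphrase || len(payload) || payload)[:12].

    len(payload) is encoded as an 8-byte little-endian integer.
    """
    length = struct.pack("<Q", len(payload))
    digest = hashlib.sha256(SALT_NONCE + passphrase + length + payload).digest()
    return digest[:12]
```

Because the nonce depends only on the passphrase and the payload, encrypting the same pair twice produces the same ciphertext, which is what lets an IPNI server compare encrypted values without decrypting them.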