Skip to content

enforce protection of OriginalFile.repo property via database triggers#5273

Merged
mtbc merged 27 commits intoome:rolesfrom
mtbc:roles-file-repo
May 11, 2017
Merged

enforce protection of OriginalFile.repo property via database triggers#5273
mtbc merged 27 commits intoome:rolesfrom
mtbc:roles-file-repo

Conversation

@mtbc
Copy link
Copy Markdown
Member

@mtbc mtbc commented May 3, 2017

What this PR does

Now that OriginalFile.repo is mapped in the OMERO model (see ome/omero-documentation#1623) we prevent this property being easily mutated because there are security implications that would otherwise require careful analysis. The repository DAO does have to be able to set the property when registering an original file in the repository. Initially I was having it drop to SQL to effect this but making a direct database change then re-merging that change into the OriginalFile Hibernate proxy impacted performance, slowing plate import.

Experimenting with BasicACLVoter.allowCreation and OmeroInterceptor methods like onFlushDirty and onSave failed to find any useful combination of calls that would allow me to detect if the repo property had been changed in a situation where I could also be reassured that the setter was trusted.

This PR introduces a hackish solution that has the twin virtues of actually working and being faster. Maybe we can find a better way but that can wait: I am putting time and behavior ahead of loveliness. The solution is: establish a secret key when the server starts up, tell it to the database then, when we try to set the originalfile.repo column, have a database trigger look for that secret key prefixing originalfile.name: if that prefix is there, then remove it and allow the change. Outside users, such as clients using the update service, have no way to access the secret key.

Testing this PR

CI should show no regressions and testFileRepoProperty*() in https://10.0.51.154:8443/job/OMERO-test-integration/lastCompletedBuild/testngreports/integration/UpdateServiceTest/ should pass. Comparing import times should reveal a small speedup, especially with significantly less time spent in createOriginalFile. Profiling the new triggers also suggests little time spent in those.

Related reading

https://trello.com/c/l4SERvRy

@jburel jburel added the roles label May 3, 2017
@joshmoore
Copy link
Copy Markdown
Member

It's an interesting solution. One minor and one longer term point:

  • Each node already has a protected UUID in spring that you could use if you wanted. In the standard configuration, it's also already stored in the DB under the "node" table. Non-root users should not be able to access it. (Note: in Chris' read-only branch, the nodes are not stored in the DB, but there no changes are intended to the DB anyway :) )
  • Relatedly, having a VM-level secret may hinder us from having multiple servers talking to the same DB down the road. At the moment, most effort is going into having master-slave-style behavior but as we move to micro-service-y architectures, this will become an issue.

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented May 5, 2017

Thank you: I shall see if I can make sense of the node table!

@joshmoore
Copy link
Copy Markdown
Member

Does the loop over nodes cause a performance hit?

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented May 5, 2017

Good question, I'm measuring performance of other stuff on another branch at the moment. 😃 My node tables might usually be too small to show the hit. Might it make sense to index node.down? How large might the table be getting in an older system, and with how many non-NULL node.down? (Pity PostgreSQL doesn't make it easier to partition node on that predicate.)

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented May 5, 2017

select count(*) from node;
select count(*) from node where down is null;

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented May 6, 2017

Nightshade's nightly-pg_dump_omero-20160808-1608.sql.gz has a node table with 149 rows, only one of them with a NULL down. I'll create an index on that column then profile. (Incidentally, the earliest row is from 2009-03-31.)

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented May 8, 2017

Useful in testing:

INSERT INTO node (id, conn, permissions, up, down, uuid)
    VALUES (nextval('seq_node'), 'fake', -52, now(), now(), CAST(random() AS TEXT));

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented May 8, 2017

Regarding #5273 (comment) performance testing is looking good to me locally with a couple of hundred nodes with a couple of them up.

@joshmoore
Copy link
Copy Markdown
Member

So I think the multi-VM issue potentially persists (i.e. if two nodes are running), though perhaps only if the nodes whose secret is being used goes down (?). Not saying that should block this PR, but something we should think about down the road. Otherwise, I'm interested to hear if you think this is now stable enough or if we will need to come back to review this solution. (i.e. is it still a hack?)

Regardless, the changes now look nicely smaller.

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented May 9, 2017

I think it should work okay with multiple nodes? Unless I messed up the logic in the triggers somehow.

I think stable enough for 5.4 anyway, not saying we might not want to come back to in in the longer term if we have a nicer idea.

@mtbc mtbc merged commit 3e01277 into ome:roles May 11, 2017
@mtbc mtbc deleted the roles-file-repo branch May 12, 2017 12:49
@sbesson sbesson added this to the 5.4.0 milestone Nov 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants