Conversation
some initial performance experiments including indexing and batching database operations
added IF NOT EXISTS to CREATE INDEX statements
…nd how performance can be improved for fetching OUs, Observations, and ObsVars
… searches. Also did some code cleanup and added a logback.xml config file
… problem. It was not needed. The ScaleValidValueCategories were present (even though it is set to fetch LAZY) because of the SQL LIKE statements called on searchQuery
…ocker Create local containerized keycloak docker example, update README
jloux-brapi left a comment
There's a lot of stuff here. I'm hoping that Pete, the BI devs who worked on this (or whoever is interested), and I can have a meeting in the new year and go through all these points together, so we can start bucketing the stuff we want to keep, change, etc.
I also made some comments pertaining to the code. I'm not really expecting these to change at this point; I'm just pointing them out to see if we can refactor or change them at some point.
```java
if(includeObservations) {
    log.debug("Fetching observations for OUs");
    for(ObservationUnitEntity entity : page) {
        log.trace("Fetching observations for OU: " + entity.getId());
        entity.getObservations();
    }
}
```
We are calling an accessor just to go around the lazy loading? If so, this is not really the intent of lazy loading, which is to fetch data only when the app accesses it for business purposes. What this method appears to be doing, though, is returning a data set directly to a user, and it actually has very little business logic. Instead of working around the lazy loading, I feel we should create a service/repo method that grabs the data the user needs in one go. Doing it as written seems to issue more DB queries than are necessary to carry out this transaction.
If we do in fact expect to load this on a given record fetch (which is what this method seems to be doing), I'd argue the Observation entity attached to ObservationUnit should be eagerly loaded, not lazily loaded, depending on whether there are other business use cases.
I believe the business logic here is that includeObservations is a rare case. 90% of the time includeObservations will be false and this will avoid much of the accessor and mapping code needed to return Observations.
I agree with you that this could be redone with a more direct call to the database to gather the observations needed, rather than relying on Hibernate to load them.
Not BI specific logic, merge this and mark it for potential rework later
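To illustrate the "grab the data in one go" idea discussed above, here is a minimal plain-Java sketch: run a single batched query for all observations belonging to a page of OUs, then group the results in memory, instead of triggering one lazy load per OU. All names here (`Observation`, `byObservationUnit`) are hypothetical stand-ins, not the actual BrAPITestServer API.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ObservationGrouping {
    // Hypothetical minimal stand-in for ObservationEntity: only the fields needed here.
    record Observation(String id, String observationUnitId) {}

    // Group the result of one batched query by OU id, replacing per-OU lazy loads.
    static Map<String, List<Observation>> byObservationUnit(List<Observation> fetchedInOneQuery) {
        return fetchedInOneQuery.stream()
                .collect(Collectors.groupingBy(Observation::observationUnitId));
    }

    public static void main(String[] args) {
        List<Observation> batch = List.of(
                new Observation("obs1", "ou1"),
                new Observation("obs2", "ou1"),
                new Observation("obs3", "ou2"));
        Map<String, List<Observation>> grouped = byObservationUnit(batch);
        System.out.println(grouped.get("ou1").size()); // prints 2
        System.out.println(grouped.get("ou2").size()); // prints 1
    }
}
```

In a real Spring Data JPA repository the batched query itself would likely be a `JOIN FETCH` or an `IN (:ids)` query; the point of the sketch is that one round trip plus in-memory grouping replaces N lazy loads.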
src/main/java/org/brapi/test/BrAPITestServer/service/pheno/ObservationUnitService.java
```java
if(!page.isEmpty()) {
    observationRepository.fetchXrefs(page, ObservationEntity.class);
}
```
Why is this not in its own separate code path? Does it really need to depend on an already-expensive query to determine its execution? Or can we create some diverging code path or endpoint where we can execute this query independently? Need to know more about the use case here.
I believe this is because Xrefs is actually part of the parent object, so for some reason it can't be accessed by the leftJoinFetch query builder above. It therefore has to be fetched after the fact, forcing an eager load here, instead of loading each observation individually later via the get methods.
It seems to me that if we need to avoid the lazy loading in a particular case, then it should all be done one way: either entirely through the LEFT JOIN FETCH query, or through Hibernate's own accessors like here....
Not BI specific, merge and mark for review with the rest of the lazy vs eager loading code
Resolving due to revelations in this comment: #2 (comment)
src/main/java/org/brapi/test/BrAPITestServer/service/pheno/ObservationService.java
```sql
CREATE OR REPLACE FUNCTION sync_list_related_tables_soft_deleted()
RETURNS TRIGGER AS $$
BEGIN
    -- Update list_external_references
    UPDATE public.list_external_references
    SET soft_deleted = NEW.soft_deleted
    WHERE list_entity_id = NEW.id;

    -- Update list_item
    UPDATE public.list_item
    SET soft_deleted = NEW.soft_deleted
    WHERE list_id = NEW.id;

    RETURN NEW;
END;
$$
LANGUAGE plpgsql;

-- Create a trigger on the list table
CREATE TRIGGER sync_soft_deleted_status
AFTER UPDATE OF soft_deleted ON public.list
FOR EACH ROW
WHEN (OLD.soft_deleted IS DISTINCT FROM NEW.soft_deleted)
EXECUTE FUNCTION sync_list_related_tables_soft_deleted();
```
I'm gonna need a walkthrough on the purpose and need of this to determine if we need this in the prod server.
This SQL migration script creates a trigger that synchronizes the soft_deleted status across related tables when a list is soft-deleted or restored. Here's a breakdown of its purpose:
1. **Trigger Function:**
   The script defines a function called `sync_list_related_tables_soft_deleted()` that performs the following actions:
   - Updates the `soft_deleted` status in the `list_external_references` table for all rows related to the updated list.
   - Updates the `soft_deleted` status in the `list_item` table for all rows related to the updated list.
2. **Trigger Creation:**
   The script then creates a trigger named `sync_soft_deleted_status` on the `public.list` table with the following characteristics:
   - It fires AFTER an UPDATE operation on the `soft_deleted` column of the `list` table.
   - It executes FOR EACH ROW affected by the update.
   - It only fires WHEN the `soft_deleted` status has actually changed (`OLD.soft_deleted IS DISTINCT FROM NEW.soft_deleted`).
The purpose of this trigger is to maintain data consistency across related tables. When a list is soft-deleted or restored (by changing its soft_deleted status), the trigger ensures that all related records in the list_external_references and list_item tables are updated accordingly.
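For reference, the trigger's propagation semantics amount to the following plain-Java sketch. This is not the actual server code; `BrapiList` and `ListItem` are hypothetical stand-ins used only to illustrate the parent-to-child flag propagation and the "only when the flag actually changes" guard.

```java
import java.util.List;

public class SoftDeleteSync {
    static class ListItem { boolean softDeleted; }

    static class BrapiList {
        boolean softDeleted;
        final List<ListItem> items;
        BrapiList(List<ListItem> items) { this.items = items; }

        // Mirrors the trigger: only propagate when the value actually changes
        // (OLD.soft_deleted IS DISTINCT FROM NEW.soft_deleted).
        void setSoftDeleted(boolean value) {
            if (this.softDeleted == value) return;
            this.softDeleted = value;
            // Mirrors "UPDATE public.list_item SET soft_deleted = NEW.soft_deleted ..."
            for (ListItem item : items) item.softDeleted = value;
        }
    }

    public static void main(String[] args) {
        BrapiList list = new BrapiList(List.of(new ListItem(), new ListItem()));
        list.setSoftDeleted(true);
        System.out.println(list.items.get(0).softDeleted); // prints true
        System.out.println(list.items.get(1).softDeleted); // prints true
    }
}
```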
```sql
ALTER TABLE public.list
    ADD COLUMN soft_deleted BOOLEAN NOT NULL DEFAULT FALSE;

ALTER TABLE public.list_external_references
    ADD COLUMN soft_deleted BOOLEAN NOT NULL DEFAULT FALSE;

ALTER TABLE public.list_item
    ADD COLUMN soft_deleted BOOLEAN NOT NULL DEFAULT FALSE;
```
`archived` is a much better name/term than `soft_deleted` IMO.
If clients could specify that they wanted archived data in responses with a parameter, I think archived makes more sense, however, if the soft-delete is transparent to the client, and soft-deleted data is never returned in responses, I think soft_deleted is a good name.
I thought the soft-delete approach was basically to ease concerns around user error causing data loss, but it would be great to have a documented objective or rationale for delete functionality in the BrAPI spec (assuming it eventually goes there).
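The naming distinction above can be sketched in plain Java: if soft-delete is transparent, every read path unconditionally filters deleted rows; if the data is "archived", a request parameter could toggle its visibility. All names here (`Row`, `visibleRows`, `includeArchived`) are invented for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TransparentSoftDelete {
    // Hypothetical row with a soft-delete flag.
    record Row(String id, boolean softDeleted) {}

    // Transparent soft-delete: callers pass includeArchived = false and never
    // see deleted rows. An "archived" model would expose the toggle to clients.
    static List<Row> visibleRows(List<Row> all, boolean includeArchived) {
        return all.stream()
                .filter(r -> includeArchived || !r.softDeleted)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("a", false), new Row("b", true));
        System.out.println(visibleRows(rows, false).size()); // prints 1
        System.out.println(visibleRows(rows, true).size());  // prints 2
    }
}
```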
```sql
-- First, drop the existing foreign key constraint
ALTER TABLE ONLY public.vendor_file_sample
    DROP CONSTRAINT IF EXISTS fke3tnyn895kve2kgixku4j7htb;

ALTER TABLE ONLY public.callset
    DROP CONSTRAINT IF EXISTS fkhreq22htrftm3dul7nfsg1agk;
```
We should definitely consider carrying these cascades over to the main branch of the prod server.
```java
public interface ListRepository extends BrAPIRepository<ListEntity, String> {
    public Page<ListEntity> findAllBySearchAndNotDeleted(SearchQueryBuilder<ListEntity> searchQuery, Pageable pageReq);

    @Query("SELECT l FROM ListEntity l WHERE l.id = :id AND l.softDeleted = false")
```
Definitely think we want to keep this functionality (and similar elsewhere) in the prod server base branch.
```java
@Query("UPDATE TrialEntity t SET t.softDeleted = :softDeleted WHERE t.id = :trialId")
int updateSoftDeletedStatus(@Param("trialId") String trialId, @Param("softDeleted") boolean softDeleted);
```
I'm starting to get the feeling these soft-delete queries could be generified. Not sure why we need to define them in every entity that needs them. I suppose it's because not every entity necessarily has this column, but I think we could employ some kind of repo inheritance for entities that need deletion: that repo could extend BrAPIRepository and contain these soft-delete methods, generified for any entity that needs them.
I agree that any common definitions like this should be placed in a parent class or interface when adding delete functionality across all primary BrAPI entities. The only reason it was repeated for distinct implementations or classes here is that delete functionality was only being implemented for three primary BrAPI entities instead of across all primary entities.
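A rough sketch of the generification idea in plain Java: entities that support deletion implement a common contract, and a single generic helper serves them all. The names (`SoftDeletable`, `markDeleted`) are invented for illustration; in the actual Spring Data codebase this would more likely be a shared `@NoRepositoryBean` base repository interface extending BrAPIRepository.

```java
public class GenericSoftDelete {
    // Common contract for any entity that has a soft_deleted column.
    interface SoftDeletable {
        boolean isSoftDeleted();
        void setSoftDeleted(boolean softDeleted);
    }

    // Hypothetical entity opting in to soft deletion.
    static class TrialEntity implements SoftDeletable {
        private boolean softDeleted;
        public boolean isSoftDeleted() { return softDeleted; }
        public void setSoftDeleted(boolean value) { softDeleted = value; }
    }

    // One generic method instead of a near-identical copy per repository.
    static <T extends SoftDeletable> T markDeleted(T entity, boolean deleted) {
        entity.setSoftDeleted(deleted);
        return entity;
    }

    public static void main(String[] args) {
        TrialEntity trial = markDeleted(new TrialEntity(), true);
        System.out.println(trial.isSoftDeleted()); // prints true
    }
}
```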
Per @BrapiCoordinatorSelby I am merging this. It seems we do want most of the code in this MR in the prod branch; however, we will have to make some new MRs to remove the code we don't want and to refactor some things. We can do this by making issues out of the more important comments here.