6665 index dv api fails by sekmiller · Pull Request #6704 · IQSS/dataverse

sekmiller · 2020-02-27T18:16:34Z

What this PR does / why we need it:
This makes the indexing of linked dataverses more efficient by modifying the Solr documents of the owned datasets instead of doing a full re-index. Previously, when the datasets were fully re-indexed, dataverses with large numbers of datasets would fail to re-index.
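The diff itself is not shown in this conversation, so the following is only a minimal sketch of the general idea: instead of rebuilding each dataset's Solr document, send a small partial update that touches one field. It uses Solr's atomic-update JSON syntax (the `add` modifier appends to a multivalued field). The field name `subtreePaths`, the document ids, and the path value are assumptions chosen for illustration and may not match what the PR actually changes.

```python
import json


def build_path_update(dataset_doc_id, linking_paths):
    """Build one Solr atomic-update document that appends paths to a dataset doc.

    Only the named field is modified; the rest of the stored document is
    left as it is, which is what avoids a full re-index of the dataset.
    """
    return {
        "id": dataset_doc_id,
        # "add" is Solr's atomic-update modifier for appending values.
        # The field name "subtreePaths" is an assumption for this sketch.
        "subtreePaths": {"add": list(linking_paths)},
    }


def build_batch(dataset_doc_ids, linking_paths):
    """Build the update payload for every dataset owned by the linked dataverse."""
    return [build_path_update(doc_id, linking_paths) for doc_id in dataset_doc_ids]


# Hypothetical ids and path, for illustration only.
payload = build_batch(["dataset_101", "dataset_102"], ["/1/7"])
print(json.dumps(payload, indent=2))
```

A payload of this shape would be posted to the Solr core's update handler. Atomic updates rely on the other fields being stored (or having docValues), so whether this approach fits depends on the schema.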

Which issue(s) this PR closes:

Closes #6665

Special notes for your reviewer:

Suggestions on how to test this:
Verify that a dataverse with a large number of datasets (mra, for one) can be re-indexed via the API. Also verify that linking dataverses still display the datasets of the linked dataverse.
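For the first check, a small helper like the one below can build the re-index request URL. This is a sketch under assumptions: the endpoint path `/api/admin/index/dataverses/{id}`, the base URL, and the database id `135` are not stated in this PR and should be confirmed against the installation's admin documentation before use.

```python
from urllib.parse import urljoin


def reindex_dataverse_url(base_url, dataverse_db_id):
    """Return the URL used to request a re-index of one dataverse.

    The endpoint path is an assumption for this sketch; the dataverse is
    identified by its numeric database id, not by its alias.
    """
    return urljoin(base_url, f"/api/admin/index/dataverses/{int(dataverse_db_id)}")


# Hypothetical host and id, for illustration only.
url = reindex_dataverse_url("http://localhost:8080", 135)
print(url)
```

The URL can then be requested with any HTTP client from a host that is allowed to reach the admin API.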

Does this PR introduce a user interface change?:
no
Is there a release notes update needed for this change?:
none
Additional documentation:

coveralls · 2020-02-27T18:27:09Z

Coverage decreased (-0.005%) to 19.438% when pulling 7856538 on 6665-index-dv-api-fails into fea57c1 on develop.

kcondon · 2020-03-03T17:37:51Z

@sekmiller It looks like batch index performance became a lot slower: from 18 hours to an estimated 6 days. I will do further testing to see whether it is a completion-rate issue, a memory/resource problem, or a few problem datasets.

Update: Ran it again, starting at 2pm on 3/3. It is still running, but only 26k of 95k datasets have been indexed, and there appear to be 4 minutes between indexing each dataset. CPU ~98%, memory OK at 30% used, ~3500 open file descriptors. This increasingly slow indexing feels like an algorithm issue where an entire list is reprocessed, so each step gets slower as the list gets larger. We've seen this in other batch jobs in the past. Just speculation, though.
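To illustrate the kind of pattern being speculated about (not a diagnosis of the actual code), the sketch below counts how many item visits occur when each step rescans everything processed so far, compared with handling each item once. The function names are hypothetical.

```python
def visits_when_rescanning(n_items):
    """Count item visits when every step rescans the whole list built so far."""
    visits = 0
    processed = []
    for item in range(n_items):
        processed.append(item)
        # Each step walks the entire growing list, so step k costs k visits.
        visits += len(processed)
    return visits


def visits_when_processing_once(n_items):
    """Count item visits when each item is handled exactly once."""
    return n_items


for n in (1_000, 10_000):
    print(n, visits_when_rescanning(n), visits_when_processing_once(n))
```

With rescanning, total work grows as n(n+1)/2, so the time per item keeps rising as the run progresses, which matches the symptom of indexing getting slower the further the batch gets.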

sekmiller added 3 commits February 24, 2020 15:07

Merge branch 'develop' into 6665-index-dv-api-fails

efa5003

#6665 update ds solr docs directly

6663394

Merge branch 'develop' into 6665-index-dv-api-fails

5ea7a52

pdurbin approved these changes Feb 27, 2020

sekmiller added 2 commits March 2, 2020 16:56

#6665 add paths to files on index dataverse

a94b391

Merge branch 'develop' into 6665-index-dv-api-fails

40de0a0

kcondon assigned kcondon and sekmiller and unassigned kcondon Mar 3, 2020

sekmiller added 8 commits March 4, 2020 11:14

Merge branch 'develop' into 6665-index-dv-api-fails

27370a9

#6665 remove path update from index all

1085a15

Merge branch 'develop' into 6665-index-dv-api-fails

6617e4a

Merge branch 'develop' into 6665-index-dv-api-fails

3af998d

#6665 removing variable metadata process for benchmarking

4f321b8

#6665 add debug lines for benchmarking with var metadata processing

1809eb0

Merge branch 'develop' into 6665-index-dv-api-fails

9cd8817

#6665 remove variable metadata indexing and debug statements

7856538

sekmiller removed their assignment Mar 12, 2020

kcondon self-assigned this Mar 12, 2020

kcondon merged commit c574792 into develop Mar 12, 2020

kcondon deleted the 6665-index-dv-api-fails branch March 12, 2020 20:56

This was referenced Mar 13, 2020

6545 solr var meta #6577

Merged

Solr search and variable level metadata #6545

Closed

djbrooke added this to the 4.20 milestone Mar 13, 2020

kcondon merged 13 commits into develop from 6665-index-dv-api-fails

5 participants
