Solr: Index all performance is too slow with full production data.

---

Author Name: **Kevin Condon** (@kcondon)
Original Redmine Issue: 3457, https://redmine.hmdc.harvard.edu/issues/3457
Original Date: 2014-01-29
Original Assignee: Philip Durbin

---

Preliminary testing shows that "index all" takes too long with full production data.

Indexing 1861 dataverses: 41 minutes 

Indexing 1900 datasets: 2 hours, 15 minutes. There are 52,000+ datasets in total.

The above numbers were measured on dvn-3 with the full production data for public dataverses and studies. Glassfish heap sizes of 512 MB and 10 GB showed the same performance.
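
To put the dataset number in perspective, here is a rough extrapolation from the sample above to the full 52,000 datasets. It is only a sketch: it assumes indexing time scales linearly with the number of datasets, which has not been verified, and `estimate_minutes` is a made-up helper name, not something in the codebase.

```python
def estimate_minutes(sample_count, sample_minutes, total_count):
    """Linearly extrapolate total indexing time from a measured sample.

    Assumption: every item costs about the same to index, so the time
    grows in proportion to the item count.
    """
    return sample_minutes * total_count / sample_count


# Measured above: 1900 datasets took 2 hours, 15 minutes (135 minutes).
# 52,000 is the lower bound quoted for the total number of datasets.
dataset_minutes = estimate_minutes(1900, 2 * 60 + 15, 52_000)

print(f"{dataset_minutes:.0f} minutes")     # 3695 minutes
print(f"{dataset_minutes / 60:.1f} hours")  # 61.6 hours
```

If the linear assumption holds, a full "index all" of the datasets would take about two and a half days at the current rate.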

---

We see `java -server -jar start.jar` at https://cwiki.apache.org/confluence/display/solr/Distributed+Search+with+Index+Sharding

-server? What does that mean?

`man java` says this...

```
   -server             Selects the Java  HotSpot  Server  VM.   For  more  information  see
                       Server-Class             Machine             Detection            at
                       http://java.sun.com/j2se/1.5.0/docs/guide/vm/server-class.html
```

... and if you follow that link you see this:

"Starting with J2SE 5.0, when an application starts up, the launcher can attempt to detect whether the application is running on a "server-class" machine and, if so, use the Java HotSpot Server Virtual Machine (server VM) instead of the Java HotSpot Client Virtual Machine (client VM). The aim is to improve performance even if no one configures the VM to reflect the application it's running. In general, the server VM starts up more slowly than the client VM, but over time runs more quickly."

Maybe this can help performance?

---

Related issue(s): #623
Redmine related issue(s): [3430](https://redmine.hmdc.harvard.edu/issues/3430), [4062](https://redmine.hmdc.harvard.edu/issues/4062)

---

