One configurable cache per repository pool #464
ajnavarro merged 5 commits into src-d:master from kuba--:cache-440
Signed-off-by: kuba-- <kuba@sourced.tech>
Signed-off-by: kuba-- <kuba@sourced.tech>
Password string `short:"P" long:"password" default:"" description:"Password used for connection."`
PilosaURL string `long:"pilosa" default:"http://localhost:10101" description:"URL to your pilosa server." env:"PILOSA_ENDPOINT"`
IndexDir string `short:"i" long:"index" default:"/var/lib/gitbase/index" description:"Directory where the gitbase indexes information will be persisted." env:"GITBASE_INDEX_DIR"`
CacheSize cache.FileSize `long:"cache" default:"536870912" description:"Object cache size" env:"GITBASE_CACHE_SIZE"`
Could you change the default to something more human-friendly, like 100, and allow only MB?
yeah sure, how about 512MB?
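A minimal sketch of what a human-friendly, MB-only option could look like. The type `CacheSizeMB` and its methods are hypothetical and not part of this PR; the `UnmarshalFlag` method assumes the options struct is parsed by a go-flags-style library that calls it for custom flag types.

```go
package main

import (
	"fmt"
	"strconv"
)

// CacheSizeMB is a hypothetical flag type: a cache size given as a whole
// number of MiB (for example "512").
type CacheSizeMB int64

// UnmarshalFlag parses the flag value. Only positive integers are accepted,
// so the unit is always MiB and never has to be written out.
func (s *CacheSizeMB) UnmarshalFlag(value string) error {
	n, err := strconv.ParseInt(value, 10, 64)
	if err != nil {
		return fmt.Errorf("invalid cache size %q: %v", value, err)
	}
	if n <= 0 {
		return fmt.Errorf("cache size must be positive, got %d", n)
	}
	*s = CacheSizeMB(n)
	return nil
}

// Bytes converts the size to bytes (1 MiB = 1024 * 1024 bytes).
func (s CacheSizeMB) Bytes() int64 {
	return int64(s) * 1024 * 1024
}

func main() {
	var size CacheSizeMB
	if err := size.UnmarshalFlag("512"); err != nil {
		panic(err)
	}
	fmt.Println(size.Bytes()) // 536870912, the current default in bytes
}
```

With this shape the default in the struct tag could read `default:"512"` instead of `default:"536870912"`.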
func NewRepositoryPool(maxCacheSize cache.FileSize) *RepositoryPool {
	return &RepositoryPool{
		repositories: make(map[string]repository),
		cache: cache.NewObjectLRU(maxCacheSize),
If I understand correctly we are using the same cache for all repositories. While this may be OK now, we may have problems when partitions are used:
- I'm not really sure of the cache implementation thread safety
- Recent objects from one repo may be evicted by others and make it run slower
- Concurrent use of the same cache may be slower (locking?)
I would have one cache per repo, a pool of caches that can be evicted, or one cache per partition. I think we can have one per repo for now and get back to it when partitions are in use.
The problem with having one cache per repo is that we will not be able to know the amount of memory gitbase will use. Maybe we can use 96 MiB * number of repositories as a default value. Or, apart from that, maybe we can implement an LRU with two key layers, to evict keys from a specific repository. WDYT?
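To make the memory concern concrete, here is a rough sketch of the worst-case arithmetic for one cache per repository. The helper `worstCaseBytes` is hypothetical; the 96 MiB figure is the per-repository value suggested in the comment above.

```go
package main

import "fmt"

// MiB is the number of bytes in one mebibyte.
const MiB = 1024 * 1024

// worstCaseBytes returns the upper bound on cache memory when every
// repository owns its own cache of perRepoBytes.
func worstCaseBytes(perRepoBytes int64, repos int) int64 {
	return perRepoBytes * int64(repos)
}

func main() {
	// The bound grows linearly with the number of repositories in the pool.
	for _, n := range []int{1, 10, 100} {
		fmt.Printf("%d repos: %d MiB\n", n, worstCaseBytes(96*MiB, n)/MiB)
	}
}
```

A single pool-wide cache keeps this bound fixed at the configured size regardless of how many repositories are added.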
So maybe we can keep one cache in the pool, but add a mutex and prefix keys by repo ID?
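A minimal sketch of that idea, assuming a byte-size-bounded LRU built on the standard library. `SharedCache` and its methods are hypothetical names for illustration; this is not the go-git `cache.ObjectLRU` implementation and does not use its API.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// entry is one cached object together with its prefixed key.
type entry struct {
	key  string
	data []byte
}

// SharedCache is a hypothetical LRU shared by all repositories in a pool.
// A mutex guards every operation, and keys are prefixed with the repo ID.
type SharedCache struct {
	mu      sync.Mutex
	maxSize int                      // total bytes allowed
	size    int                      // total bytes currently stored
	order   *list.List               // front = most recently used
	items   map[string]*list.Element // prefixed key -> list element
}

func NewSharedCache(maxSize int) *SharedCache {
	return &SharedCache{maxSize: maxSize, order: list.New(), items: make(map[string]*list.Element)}
}

// prefixed builds the cache key. Object hashes are hex strings, so the
// "/" separator cannot appear inside the hash part.
func prefixed(repoID, hash string) string { return repoID + "/" + hash }

// Put stores data and evicts least recently used entries until it fits.
func (c *SharedCache) Put(repoID, hash string, data []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	k := prefixed(repoID, hash)
	if el, ok := c.items[k]; ok { // replace an existing entry
		c.size -= len(el.Value.(*entry).data)
		c.order.Remove(el)
		delete(c.items, k)
	}
	if len(data) > c.maxSize {
		return // larger than the whole cache, do not store
	}
	c.items[k] = c.order.PushFront(&entry{key: k, data: data})
	c.size += len(data)
	for c.size > c.maxSize {
		e := c.order.Remove(c.order.Back()).(*entry)
		delete(c.items, e.key)
		c.size -= len(e.data)
	}
}

// Get returns the cached data and marks the entry as recently used.
func (c *SharedCache) Get(repoID, hash string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[prefixed(repoID, hash)]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

func main() {
	c := NewSharedCache(8)
	c.Put("repo-a", "abc", []byte("1234"))
	c.Put("repo-b", "abc", []byte("5678")) // same hash, different repo
	c.Put("repo-a", "def", []byte("90"))   // exceeds 8 bytes, evicts repo-a/abc
	_, okA := c.Get("repo-a", "abc")
	vB, okB := c.Get("repo-b", "abc")
	fmt.Println(okA, string(vB), okB)
}
```

The prefix keeps objects from different repositories apart, but eviction is still global: a busy repository can push out another repository's recent objects, which is the trade-off raised earlier in this thread.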
  // AddGitWithID adds a git repository to the pool. ID should be specified.
  func (p *RepositoryPool) AddGitWithID(id, path string) error {
- 	return p.Add(gitRepo(id, path))
+ 	return p.Add(gitRepo(id, path, p.cache))
I wouldn't store the cache in the git/sivaRepo. If they are different instances (one per repo) it can use lots of memory.
Signed-off-by: kuba-- <kuba@sourced.tech>
Signed-off-by: kuba-- <kuba@sourced.tech>
@jfontan - rebased. Let's go with the simplest approach for now.
jfontan left a comment
Let's go with this approach and iterate over it. I agree.
This PR closes #440
Changes:
- go-git repository interface requires Cache() cache.Object
- RepositoryPool