METRON-2073: Create in-memory use case for enrichment with map type and flatfile summarizer #1399

merrimanr · 2019-05-03T19:32:52Z

Contributor Comments

This PR adds a Stellar function for performing in-memory enrichments. Similar to ENRICHMENT_GET, which performs lookups in HBase, this function performs lookups against an in-memory map that is loaded from HDFS. The path parameter of ENRICHMENT_IN_MEMORY_GET plays the same role as the nosql_table, enrichment_type, and column_family parameters of ENRICHMENT_GET: it uniquely identifies the data source. The indicator parameter is the same (the key used to look up an enrichment). The new function is very similar to the OBJECT_GET function (they use the same cache abstraction) except that it performs the lookup, offers configuration specifically for in-memory enrichments, and potentially more. The in-memory objects are lazy-loaded, meaning they are loaded the first time ENRICHMENT_IN_MEMORY_GET is executed.

The Jira states the first pass should be a 1-time load but it wasn't difficult to expose cache expiration settings. This allows cached enrichments to be refreshed at an interval and will support streaming enrichments.
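To make the lazy-load and refresh behavior concrete, here is a minimal sketch in plain Java. It is not the PR's implementation (which builds on a caching library and reads from HDFS); the class name `ExpiringEnrichmentCache`, the injected loader function, and the injectable clock are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch: loads a map per path on first use and reloads it once it is older than a TTL. */
class ExpiringEnrichmentCache {

  private static final class Entry {
    final Map<String, Object> data;
    final long loadedAtMillis;

    Entry(Map<String, Object> data, long loadedAtMillis) {
      this.data = data;
      this.loadedAtMillis = loadedAtMillis;
    }
  }

  private final Map<String, Entry> entries = new ConcurrentHashMap<>();
  private final Function<String, Map<String, Object>> loader; // stands in for the HDFS read
  private final long ttlMillis;
  private final LongSupplier clock;

  ExpiringEnrichmentCache(Function<String, Map<String, Object>> loader, long ttlMillis, LongSupplier clock) {
    this.loader = loader;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /** Looks up the indicator in the map stored at path, loading or reloading the map when needed. */
  Object get(String path, String indicator) {
    long now = clock.getAsLong();
    Entry entry = entries.compute(path, (p, existing) -> {
      boolean expired = existing != null && now - existing.loadedAtMillis >= ttlMillis;
      return (existing == null || expired) ? new Entry(loader.apply(p), now) : existing;
    });
    return entry.data.get(indicator);
  }

  public static void main(String[] args) {
    long[] now = {0L}; // fake clock so the example does not have to sleep
    Map<String, Map<String, Object>> fakeHdfs = new ConcurrentHashMap<>();
    fakeHdfs.put("/tmp/enrichments.ser", Collections.singletonMap("key", "value"));
    ExpiringEnrichmentCache cache = new ExpiringEnrichmentCache(fakeHdfs::get, 60_000L, () -> now[0]);

    System.out.println(cache.get("/tmp/enrichments.ser", "key")); // value (first call triggers the load)
    fakeHdfs.put("/tmp/enrichments.ser", Collections.singletonMap("key", "updated value"));
    System.out.println(cache.get("/tmp/enrichments.ser", "key")); // value (cached copy still fresh)
    now[0] = 60_000L;
    System.out.println(cache.get("/tmp/enrichments.ser", "key")); // updated value (entry expired, reloaded)
  }
}
```

The fake clock mirrors the manual test later in this description, where the value changes only after the expiration interval has passed.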

Cache settings are applied in the following order:

1. Hardcoded defaults defined in constants
2. Top-level global config settings (used by the OBJECT_GET function)
3. Global config settings specific for ENRICHMENT_IN_MEMORY_GET

These settings are passed into the Guava CacheBuilder when the function is initialized.
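The layering can be sketched as follows. This is an illustration only: the key names come from this description, the 24 hour default is the one mentioned in the testing instructions, and the class and helper names (`CacheSettingsSketch`, `resolveExpiration`) as well as the way the default is represented are assumptions. The legacy `object.cache.expiration.minutes` key is left out for brevity.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the three-layer settings resolution; later layers win. */
class CacheSettingsSketch {

  // 1. Hardcoded defaults (24 hours is the default stated in this PR; its representation here is assumed).
  static final long DEFAULT_EXPIRATION = 24L;
  static final TimeUnit DEFAULT_TIME_UNIT = TimeUnit.HOURS;

  static Duration resolveExpiration(Map<String, Object> globalConfig) {
    long expiration = DEFAULT_EXPIRATION;
    TimeUnit unit = DEFAULT_TIME_UNIT;

    // 2. Top-level global config settings (shared with OBJECT_GET).
    expiration = longOrDefault(globalConfig.get("object.cache.expiration"), expiration);
    unit = unitOrDefault(globalConfig.get("object.cache.time.unit"), unit);

    // 3. Settings nested under the function-specific key.
    Object nested = globalConfig.get("in.memory.enrichment.settings");
    if (nested instanceof Map) {
      Map<?, ?> settings = (Map<?, ?>) nested;
      expiration = longOrDefault(settings.get("cache.expiration"), expiration);
      unit = unitOrDefault(settings.get("time.unit"), unit);
    }
    return Duration.ofMillis(unit.toMillis(expiration));
  }

  private static long longOrDefault(Object value, long fallback) {
    return value instanceof Number ? ((Number) value).longValue() : fallback;
  }

  private static TimeUnit unitOrDefault(Object value, TimeUnit fallback) {
    return value != null ? TimeUnit.valueOf(value.toString()) : fallback;
  }

  public static void main(String[] args) {
    Map<String, Object> config = new HashMap<>();
    System.out.println(resolveExpiration(config).getSeconds()); // 86400 (defaults only)

    config.put("object.cache.expiration", 1);
    config.put("object.cache.time.unit", "MINUTES");
    System.out.println(resolveExpiration(config).getSeconds()); // 60 (top-level overrides defaults)

    Map<String, Object> nested = new HashMap<>();
    nested.put("cache.expiration", 15);
    nested.put("time.unit", "SECONDS");
    config.put("in.memory.enrichment.settings", nested);
    System.out.println(resolveExpiration(config).getSeconds()); // 15 (nested overrides top-level)
  }
}
```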

Changes Included

An ObjectCache abstraction was created from the code in the Stellar OBJECT_GET function. This abstraction is shared by both the OBJECT_GET and ENRICHMENT_IN_MEMORY_GET functions. The current settings used in OBJECT_GET should be backwards compatible.
Dedicated cache settings for ENRICHMENT_IN_MEMORY_GET can be configured in the global config with the in.memory.enrichment.settings key.
Unit tests were added for new code and expanded for existing code.
Logging was added in various places to make it easier to determine when entries are being flushed and for what reason.

Testing Instructions

This has been tested in full dev with the Stellar CLI.

Spin up full dev and ensure data is flowing with no errors.
Create an enrichments.csv file with enrichment data:

key,value

Create an enrichments_updated.csv file with updated enrichment data:

key,updated value

Create a summarizer.json file that will serialize this data as a map in HDFS:

{
  "config" : {
    "columns" : {
      "key" : 0,
      "value" : 1
    },
    "state_init" : "{}",
    "state_update" : {
      "state" : "MAP_PUT(key, value, state)"
    },
    "separator" : ","
  },
  "extractor" : "CSV"
}
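For readers unfamiliar with the summarizer, the config above roughly amounts to folding every CSV row into a single map. The sketch below mimics that fold in plain Java; it is only a conceptual analogy (the real tool evaluates the Stellar expressions and serializes the resulting state), and `SummarizerFoldSketch` and `summarize` are made-up names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical analogy for state_init "{}" plus state_update MAP_PUT(key, value, state). */
class SummarizerFoldSketch {

  static Map<String, String> summarize(List<String> csvLines, String separator) {
    Map<String, String> state = new LinkedHashMap<>(); // state_init: an empty map
    for (String line : csvLines) {
      String[] columns = line.split(Pattern.quote(separator), -1);
      if (columns.length < 2) {
        continue; // skip rows that do not have both configured columns
      }
      // columns: "key" is index 0, "value" is index 1; state_update puts them into the map
      state.put(columns[0], columns[1]);
    }
    return state;
  }

  public static void main(String[] args) {
    System.out.println(summarize(Arrays.asList("key,value"), ",")); // {key=value}
  }
}
```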

Use the flatfile_summarizer.sh script to create serialized files:

/usr/metron/0.7.1/bin/flatfile_summarizer.sh -i ./enrichments.csv -o ./enrichments.ser -e summarizer.json -p 1
/usr/metron/0.7.1/bin/flatfile_summarizer.sh -i ./enrichments_updated.csv -o ./enrichments_updated.ser -e summarizer.json -p 1

Put the first file in HDFS:

hdfs dfs -put enrichments.ser /tmp/enrichments.ser

Start the Stellar shell and define a field we can use for the enrichments:

[root@node1 tmp]# /usr/metron/0.7.1/bin/stellar --zookeeper node1:2181
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/metron/0.7.1/lib/metron-profiler-repl-0.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.6.5.1050-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Stellar, Go!
Functions are loading lazily in the background and will be unavailable until loaded fully.
{es.clustername=metron, es.ip=node1:9200, es.date.format=yyyy.MM.dd.HH, parser.error.topic=indexing, update.hbase.table=metron_update, update.hbase.cf=t, es.client.settings={}, profiler.client.period.duration=15, profiler.client.period.duration.units=MINUTES, enrichment.list.hbase.provider.impl=org.apache.metron.hbase.HTableProvider, enrichment.list.hbase.table=enrichment_list, enrichment.list.hbase.cf=t, user.settings.hbase.table=user_settings, user.settings.hbase.cf=cf, bootstrap.servers=node1:6667, source.type.field=source:type, threat.triage.score.field=threat:triage:score, enrichment.writer.batchSize=15, enrichment.writer.batchTimeout=0, profiler.writer.batchSize=15, profiler.writer.batchTimeout=0, geo.hdfs.file=/apps/metron/geo/default/GeoLite2-City.tar.gz, asn.hdfs.file=/apps/metron/asn/default/GeoLite2-ASN.tar.gz, object.cache.expiration.minutes=1, in.memory.enrichment.settings={cache.expiration=15, time.unit=SECONDS}}
[Stellar]>>> field := 'key'
key

Perform an in-memory enrichment. There should be a slight delay (for the initial load) but subsequent calls should return immediately:

[Stellar]>>> ENRICHMENT_IN_MEMORY_GET('/tmp/enrichments.ser', field)
value

The default cache expiration is 24 hours. Exit the Stellar CLI and set the object cache expiration to 1 minute in the global config:

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  ...
  "object.cache.expiration.minutes": 1
}' 'http://user:password@node1:8082/api/v1/global/config'

Note the object.cache.expiration.minutes setting is maintained for backwards compatibility. You can also set the expiration to 1 minute with these settings:

{
  "object.cache.expiration": 1,
  "object.cache.time.unit": "MINUTES"
}

Start the Stellar CLI again and perform the enrichment again. You should get the same result:

[root@node1 tmp]# /usr/metron/0.7.1/bin/stellar --zookeeper node1:2181
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/metron/0.7.1/lib/metron-profiler-repl-0.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.6.5.1050-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Stellar, Go!
Functions are loading lazily in the background and will be unavailable until loaded fully.
{es.clustername=metron, es.ip=node1:9200, es.date.format=yyyy.MM.dd.HH, parser.error.topic=indexing, update.hbase.table=metron_update, update.hbase.cf=t, es.client.settings={}, profiler.client.period.duration=15, profiler.client.period.duration.units=MINUTES, enrichment.list.hbase.provider.impl=org.apache.metron.hbase.HTableProvider, enrichment.list.hbase.table=enrichment_list, enrichment.list.hbase.cf=t, user.settings.hbase.table=user_settings, user.settings.hbase.cf=cf, bootstrap.servers=node1:6667, source.type.field=source:type, threat.triage.score.field=threat:triage:score, enrichment.writer.batchSize=15, enrichment.writer.batchTimeout=0, profiler.writer.batchSize=15, profiler.writer.batchTimeout=0, geo.hdfs.file=/apps/metron/geo/default/GeoLite2-City.tar.gz, asn.hdfs.file=/apps/metron/asn/default/GeoLite2-ASN.tar.gz, object.cache.expiration.minutes=1, in.memory.enrichment.settings={cache.expiration=15, time.unit=SECONDS}}
[Stellar]>>> field := 'key'
key
[Stellar]>>> ENRICHMENT_IN_MEMORY_GET('/tmp/enrichments.ser', field)
value

In a separate window, upload the updated enrichment file to HDFS:

hdfs dfs -put -f /tmp/enrichments_updated.ser /tmp/enrichments.ser

After a minute the value should change to the value in enrichments_updated.ser:

[Stellar]>>> ENRICHMENT_IN_MEMORY_GET('/tmp/enrichments.ser', field)
updated value

The cache expiration can also be set specifically for ENRICHMENT_IN_MEMORY_GET. Put the previous enrichments.ser back in HDFS and change the cache settings to 15 seconds:

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  ...
  "object.cache.expiration.minutes": 1,
  "in.memory.enrichment.settings": {
    "cache.expiration": 15,
    "time.unit": "SECONDS"
  }
}' 'http://user:password@node1:8082/api/v1/global/config'

Perform steps 9 - 11. Now the value should be updated after 15 seconds.

Next Steps

This is intended to be a first pass and there are still some outstanding items. I am planning on adding javadocs and a section in our READMEs describing how to do in-memory enrichments.

Are there other features people would like to see added here? I think it could be useful to set cache settings for each object in HDFS (currently they are global to all objects and can only be changed on initialization). This would make things more complex and require some design work.

Does anyone have suggestions for more appropriate function and setting names?

What else would you change about this?

Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.
Please refer to our Development Guidelines for the complete guide to follow for contributions.
Please refer also to our Build Verification Guidelines for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

For all changes:

Is there a JIRA ticket associated with this PR? If not one needs to be created at Metron Jira.
Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically master)?

For code changes:

Have you included steps to reproduce the behavior or problem that is being changed or addressed?
Have you included steps or a guide to how the change may be verified and tested manually?
Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
```
mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
```
Have you written or updated unit tests and or integration tests to verify your changes?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

For documentation related changes:

Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes via site-book/target/site/index.html:
```
cd site-book
mvn site
```

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that travis-ci is set up for your personal repository such that your branches are built there before submitting a pull request.

simonellistonball · 2019-05-10T15:35:18Z

Would it be worth changing the caching to use caffeine, as we have in other locations, given the performance gains we've seen for larger caches with caffeine?

simonellistonball · 2019-05-10T15:39:13Z

On naming: since this is very much based on, and similar to OBJECT_GET, how about ENRICHMENT_OBJECT_GET?

simonellistonball · 2019-05-10T15:41:10Z

The other thing worth thinking about is some guard-rails around cache size. If we load a large file, it seems like there is a good chance of blowing up the enrichment topology somewhere hard to debug with an OOM. That's probably something that needs pushing down to OBJECT_GET too to be honest.
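A guard of this kind might look like the sketch below: check the file size against a configured limit before reading anything into memory, so the failure is an explicit error rather than an OOM. The sketch uses the local filesystem via `java.nio` instead of HDFS, and `SizeGuardedLoader`, `requireWithinLimit`, and the limit parameter are hypothetical names, not the PR's API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: refuse to load files larger than a configured maximum. */
class SizeGuardedLoader {

  private final long maxFileSizeBytes;

  SizeGuardedLoader(long maxFileSizeBytes) {
    this.maxFileSizeBytes = maxFileSizeBytes;
  }

  /** Throws if the reported size exceeds the limit; a size equal to the limit is allowed. */
  static void requireWithinLimit(String path, long sizeBytes, long maxBytes) {
    if (sizeBytes > maxBytes) {
      throw new IllegalArgumentException(String.format(
          "File at %s is %d bytes, which exceeds the limit of %d bytes", path, sizeBytes, maxBytes));
    }
  }

  /** Reads the file only after the size check has passed. */
  byte[] load(Path path) {
    try {
      requireWithinLimit(path.toString(), Files.size(path), maxFileSizeBytes);
      return Files.readAllBytes(path);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("enrichments", ".ser");
    Files.write(file, new byte[16]);
    try {
      System.out.println(new SizeGuardedLoader(1024).load(file).length); // 16
      new SizeGuardedLoader(8).load(file); // throws: 16 bytes is over the 8 byte limit
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```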

merrimanr · 2019-05-13T20:48:02Z

The latest commits should address your feedback @simonellistonball. Have you given any thought to being able to configure different cache settings for different paths/objects?

mmiklavc

@merrimanr This is a nice addition to our Stellar features! I really like the mem/size check addition.

I have a number of comments around test setup that I think are worth working through along with some requests for configuration documentation. One additional ask would be to add some integration tests. See the following for examples:

mmiklavc · 2019-05-21T23:16:08Z

...nrichment-common/src/main/java/org/apache/metron/enrichment/stellar/EnrichmentObjectGet.java

+
+@Stellar(namespace="ENRICHMENT"
+        ,name="OBJECT_GET"
+        ,description="Retrieve and deserialize a serialized object from HDFS and stores it in the ObjectCache,  " +


Can you add references to the global config options like we do for OBJECT_GET?

metron/metron-platform/metron-enrichment/metron-enrichment-common/src/main/java/org/apache/metron/enrichment/stellar/ObjectGet.java

Line 55 in 14efe83

"\"" + ObjectGet.OBJECT_CACHE_SIZE_KEY + "\" (default " + ObjectGet.OBJECT_CACHE_SIZE_DEFAULT + ")," +

Done in latest commit.

mmiklavc · 2019-05-22T17:22:48Z

...tron-enrichment-common/src/test/java/org/apache/metron/enrichment/stellar/ObjectGetTest.java

-    }
+  public void setup() throws Exception {
+    objectGet = new ObjectGet();
+    objectCache = mock(ObjectCache.class);


Any reason not to just use the real ObjectCache here? I get that it might arguably be an integration test at that point, but it's also a pretty trivial dependency. I've been thinking about this a while now, and I think we have quite a few instances where we're doing dependency injection and mocking where just using the real objects would probably be 1) simpler, 2) clearer, and 3) a more accurate, less error-prone test. This test is simple enough, though there are many others where the test setup for the mocks is anything but obvious. For example, this is from a recent PR of mine - https://github.com/apache/metron/pull/1409/files#diff-ab0ebf385d42a0fe97232a3e1131936bR338. More often than not, I find that spies (parserRunner -> https://github.com/apache/metron/pull/1409/files#diff-ab0ebf385d42a0fe97232a3e1131936bR222) are a symptom/code smell that warrants some further thought on what a class is doing and whether we should be pushing some of its responsibility to other collaborating classes.

See some expanded thoughts on mocking from Uncle Bob Martin here - https://blog.cleancoder.com/uncle-bob/2014/05/10/WhenToMock.html

Per another of my comments, it looks like the main reason to mock here is the underlying Loader. I'd consider injecting a simple faux Loader if you don't want to add that complexity. I think there's value in testing the true integration between the Stellar function ENRICHMENT_OBJECT_GET wrapper and the underlying ObjectCache implementation versus just mocking the functionality of ObjectCache.

You make some good points on the tradeoffs of using mocks vs using real classes. In this case though, I don't agree that using the real ObjectCache class would be simpler.

Using a mock object for the cache is pretty simple here. The mock is set up to return values from method calls and verify methods were called correctly. This is done with only a couple lines of code.

If we were to use the real ObjectCache, now we have to:

Create an arbitrary cache configuration

Create a test Loader and inject it instead of the default Loader

Setup and instantiate the cache

This is more complex in my opinion. Plus any change to the ObjectCache setup requirements would also require an update to this test. I've experienced pain in the past where a change to a low level class required significant changes to several different tests because the tests were relying on the real implementation rather than being isolated to the class it's testing.

Again, I take your point that we should be cautious about over-using mocks. But I don't think it applies here.

I think the issue is that you never actually test the integration between those classes. It's only ever a mock interaction from the tests you've provided.

I am planning on adding the integration tests you requested. Would that cover it?

I think that makes sense, thanks @merrimanr.

Can you make sure we're setting sensible defaults for the cache as well? These two items surprise me that they would matter for this test:

Create an arbitrary cache configuration

Setup and instantiate the cache

It also wasn't clear to me that I'd have to do those things, even after spending a lot of time going over the code. I would have thought that I can just do new ObjectCache(simpleLoader), where simpleLoader is a basic stunt double reading from a string/bytearray or something, and have config defaults that just work. The unit test for ObjectCache should be testing all the fine-grained details, which are less important here. Likewise, the integration tests can just verify that the config is being picked up and passed through - no need to test all possible combos from the integration test standpoint.

Now that we're setting defaults in the ObjectCacheConfig constructor, all of this is much simpler. There is no need to populate cache settings as the defaults will work in most cases. Now the Create an arbitrary cache configuration step can be done by just creating a new ObjectCacheConfig object. You would still have to initialize the cache by calling ObjectCache.initialize() and provide a custom test Loader though.

mmiklavc · 2019-05-22T17:33:04Z

...tron-enrichment-common/src/test/java/org/apache/metron/enrichment/cache/ObjectCacheTest.java

+  }
+
+  @Test
+  public void testMultithreaded() throws Exception {


We should probably have some sort of timeout that will kill this if there is a multithreading issue. Maybe use an ExecutorService here instead?
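One way to bound such a test is sketched below, assuming nothing about the existing test body: run the concurrent work through an `ExecutorService` and use the timed `invokeAll`, which cancels whatever has not finished when the timeout expires. `BoundedConcurrencyCheck` and `allComplete` are hypothetical names.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: run a task on several threads, giving up after a timeout. */
class BoundedConcurrencyCheck {

  /** Returns true only if every copy of the task finished in time and returned true. */
  static boolean allComplete(Callable<Boolean> task, int threads, long timeout, TimeUnit unit) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<Boolean>> tasks = Collections.nCopies(threads, task);
      // The timed invokeAll cancels any task that has not completed when the timeout expires.
      List<Future<Boolean>> futures = executor.invokeAll(tasks, timeout, unit);
      for (Future<Boolean> future : futures) {
        if (future.isCancelled() || !future.get()) {
          return false;
        }
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } catch (ExecutionException e) {
      return false; // the task itself threw
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(allComplete(() -> true, 4, 5, TimeUnit.SECONDS)); // true
    // A task that hangs is cancelled after 200 ms instead of blocking the run.
    System.out.println(allComplete(() -> { Thread.sleep(10_000); return true; }, 2, 200, TimeUnit.MILLISECONDS)); // false
  }
}
```

In a JUnit 4 test, the `timeout` parameter of `@Test` is another option for the same goal.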

Let me look into this further. This test was preexisting and I'm not sure what the original intention was.

I don't think this is necessarily a blocker.

mmiklavc · 2019-05-22T18:50:33Z

...t/metron-enrichment-common/src/main/java/org/apache/metron/enrichment/cache/ObjectCache.java

+
+  protected LoadingCache<String, Object> cache;
+  private static ReadWriteLock lock = new ReentrantReadWriteLock();
+  protected LoadingCache<String, Object> getCache() {


Why are we exposing the underlying cache?

This is only for testing purposes. I like your other suggestion of providing specific methods for things we need to test so this will go away when I make those changes.

mmiklavc · 2019-05-22T19:09:27Z

...tron-enrichment-common/src/test/java/org/apache/metron/enrichment/cache/ObjectCacheTest.java

+  @Test
+  public void test() throws Exception {
+    String filename = "target/ogt/test.ser";
+    Assert.assertTrue(cache.getCache() == null || !cache.getCache().asMap().containsKey(filename));


I realize some of this was probably around before, but these are great opportunities to make incremental improvements to the code base with a minimal impact/risk. I think a better option than exposing class internals - see Indecent Exposure - would be to offer methods that provide the functionality you're adding in the test. e.g.

ObjectCache cache = new ObjectCache();
Assert.assertTrue(cache.isEmpty());
Assert.assertTrue(cache.size() == 0);
Assert.assertFalse(cache.hasKey(filename));

It's both clearer and doesn't require you to expose the internals of your class under test. I get the compulsion to do this - you might be thinking "well, how do I know it's null?" I couldn't find the original text from Kent Beck's book on TDD online that I'm thinking of, but the red, green, refactor cycle really covers this pretty well - part of the refactor phase is to swap out a static implementation, e.g. "return false" and put the real thing there. The test should still pass when you refactor. Not that you're doing TDD here and I'm not suggesting that you do, rather my point is there's precedent for not having to expose class internals for tests. Here's some additional background on it - https://blog.cleancoder.com/uncle-bob/2014/12/17/TheCyclesOfTDD.html.

Also regarding class internals exposure - https://martinfowler.com/bliki/TellDontAsk.html

Tell-Don't-Ask is a principle that helps people remember that object-orientation is about bundling data with the functions that operate on that data. It reminds us that rather than asking an object for data and acting on that data, we should instead tell an object what to do.

This approach also helps avoid "train wrecks" and violating the Law of Demeter as seen in !cache.getCache().asMap().containsKey(filename)
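On the implementation side, the suggestion could look like the sketch below: the backing map stays private and the class answers the questions a test needs to ask. The method names `isEmpty`, `size`, and `hasKey` come from the example above; the class name `InspectableCache` and the plain map standing in for the real cache are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: expose intent-revealing queries instead of the underlying cache. */
class InspectableCache {

  // The backing store is never handed out, so callers cannot chain calls into its internals.
  private final Map<String, Object> entries = new ConcurrentHashMap<>();

  void put(String path, Object value) {
    entries.put(path, value);
  }

  boolean isEmpty() {
    return entries.isEmpty();
  }

  int size() {
    return entries.size();
  }

  boolean hasKey(String path) {
    return entries.containsKey(path);
  }

  public static void main(String[] args) {
    InspectableCache cache = new InspectableCache();
    System.out.println(cache.isEmpty());                     // true
    cache.put("target/ogt/test.ser", "data");
    System.out.println(cache.size());                        // 1
    System.out.println(cache.hasKey("target/ogt/test.ser")); // true
  }
}
```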

You are correct, this was preexisting. Happy to make your suggested changes.

Done with the latest commit.

mmiklavc · 2019-05-22T19:28:45Z

...nrichment-common/src/main/java/org/apache/metron/enrichment/stellar/EnrichmentObjectGet.java

+    Object value;
+    try {
+      Map cachedMap = (Map) objectCache.get(path);
+      LOG.debug("Looking up value from object at path '{}' using indicator {}", path, indicator);


Thanks for using the {} pattern, thumbs up.

mmiklavc · 2019-05-22T19:38:34Z

...nrichment-common/src/main/java/org/apache/metron/enrichment/stellar/EnrichmentObjectGet.java

+  public void initialize(Context context) {
+    Map<String, Object> config = (Map<String, Object>) context.getCapability(Context.Capabilities.GLOBAL_CONFIG, false)
+            .orElse(new HashMap<>());
+    ObjectCacheConfig objectCacheConfig = ObjectCacheConfig.fromGlobalConfig(config);


Why do we have both this fromGlobalConfig(config) method and then all the same config setup duplicated again below? e.g. OBJECT_CACHE_EXPIRATION_KEY. I think what you put in fromGlobalConfig(config) is pretty clear, well-encapsulated, and versatile. Am I missing something here?

Can you also add README details around the new global config options and an example of how they're set (showing where/how they're nested in the global config)? e.g.

#global config
{
  ...
  "cache.option.1" : "val1",
  "cache.option.2" : false,
  "cache.option.3" : 879,
  ...
}

The config setup below overrides the properties that come from fromGlobalConfig(config). I decided to keep the keys the same; the only difference is that the EnrichmentObjectGet settings are nested inside the enrichment.object.get.settings property. As I was testing I found it easier to remember the settings if they were consistent across both. Also, we want to allow a mix of default global settings and specific EnrichmentObjectGet settings. That's why it looks kind of strange with all the if statements. Can you think of a cleaner way to do it?

The config setup below overrides the properties that come from fromGlobalConfig(config).

I must be missing something - you're passing config in order to get fromGlobalConfig to create an objectCacheConfig. But then you're again using config.get(someKey) to override what you've already extracted from config in the first place?

Here is how the process around cache settings works:

Create a new ObjectCacheConfig object. This config will be a mix of defaults and top-level object get global config settings, depending on what is set in the global config.

Retrieve the nested enrichment object get settings using the enrichment.object.get.settings key.

For each setting, use the setting in the nested object from the previous step if it is defined. Otherwise fall back to the setting from step 1.

Does that make sense?

Ah geeze, I see what's happening now. I had to go back over this a few times to grok what's going on. Ok, so I wasn't aware of or clear on what was previously in global config from the original OBJECT_GET work versus what was added in this new feature. Link provided for context. I get wanting to keep OBJECT_GET's original config around in global config for existing functionality. I'm a little foggy on why the in-memory function wouldn't just have its own config completely independent of the original, i.e. just cut the cable on the config overrides, even though you're sharing similar infrastructure for the object cache. The 2 different Stellar functions can instantiate the underlying cache object with their own config however they want. I'm all about code reuse, and I like what you've done with the cache refactoring. I also like providing users options with sensible defaults, as we've done many other places in the application. But in this case I'm unclear what the added value is for this extra bit of complexity with the config inheritance - can we just let the config for the 2 different functions work independently and get rid of the override?

Sure that makes sense, it will make it simpler. I will separate them.

Latest commit makes ENRICHMENT_OBJECT_GET and OBJECT_GET configs separate. Should be simpler and easier to read now.

mmiklavc · 2019-05-22T19:48:39Z

...tron-enrichment-common/src/test/java/org/apache/metron/enrichment/cache/ObjectCacheTest.java

+    try(BufferedOutputStream bos = new BufferedOutputStream(fs.create(new Path(filename), true))) {
+      IOUtils.write(SerDeUtils.toBytes(data), bos);
+    }
+    cache.initialize(ObjectCacheConfig.fromGlobalConfig(new HashMap<>()));


What is this actually doing? Why not just new ObjectCacheConfig()?

Initializing the cache requires some ObjectCacheConfig properties to be set, otherwise you get NPEs. I just did this for convenience. I'm happy to change it to new ObjectCacheConfig() and explicitly set the cache properties.

It sounds like you might mean to have the no-arg default constructor set to private ObjectCacheConfig() {} because using it causes unrecoverable errors. Alternatively, you could also do away with fromGlobalConfig and simply create

public ObjectCacheConfig(Map<String, Object> globalConfig) { ... }

but I don't think we should have both. I personally prefer the latter approach and leveraging constructors for their intended purpose. Let's let the API dictate the use rather than relying on javadoc or, worse yet, NPEs only found at runtime.
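A sketch of the constructor-only approach follows. The two key names are the top-level ones mentioned earlier in this PR and the 24 hour default is the one stated in the testing instructions; the class name `CacheConfigSketch`, the getters, and how the default is expressed are assumptions, not the actual ObjectCacheConfig.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: the only way to build the config guarantees fully populated fields. */
final class CacheConfigSketch {

  private final long expiration;
  private final TimeUnit timeUnit;

  /** No no-arg constructor exists, so a half-initialized config cannot be created. */
  CacheConfigSketch(Map<String, Object> globalConfig) {
    Object configuredExpiration = globalConfig.get("object.cache.expiration");
    Object configuredUnit = globalConfig.get("object.cache.time.unit");
    // Fall back to defaults so nothing null ever reaches the cache builder.
    this.expiration = configuredExpiration instanceof Number
        ? ((Number) configuredExpiration).longValue()
        : 24L;
    this.timeUnit = configuredUnit != null
        ? TimeUnit.valueOf(configuredUnit.toString())
        : TimeUnit.HOURS;
  }

  long getExpiration() {
    return expiration;
  }

  TimeUnit getTimeUnit() {
    return timeUnit;
  }

  public static void main(String[] args) {
    CacheConfigSketch defaults = new CacheConfigSketch(new HashMap<>());
    System.out.println(defaults.getExpiration() + " " + defaults.getTimeUnit()); // 24 HOURS
  }
}
```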

Side note - what specifically causes NPEs? Can we address that as well, while we're at it?

You're right. The no arg constructor doesn't really serve any purpose. I will change fromGlobalConfig to a constructor (should have done this in the first place).

The NPE comes from Caffeine code. As long as we're not passing in null values to the cache builder it's not an issue.

See my other recent comments on this - can we setup some sensible defaults for the config? I think the pattern we've attempted to follow is to maximize configurability of the system by exposing as many options as possible, but also making it easy to get up and running with some defaults that they can configure PRN. I'd look at the cache used in ParallelEnricher to see what has been used for defaults in other caches.

I believe moving from fromGlobalConfig to a constructor will solve this since that method sets sensible defaults.

Done with the latest commit.

mmiklavc · 2019-05-22T19:57:04Z

...tron-enrichment-common/src/test/java/org/apache/metron/enrichment/cache/ObjectCacheTest.java

+
+  @Test
+  public void test() throws Exception {
+    String filename = "target/ogt/test.ser";


This approach leaves open a potential for dirty data. A few options that eliminate this issue altogether are:

Our TestUtils functions. You'll need to tweak them for writing byte arrays, but there is temp dir functionality there as well - https://github.com/apache/metron/blob/master/metron-platform/metron-integration-test/src/main/java/org/apache/metron/integration/utils/TestUtils.java

JUnit's temp directory/file feature

metron/metron-platform/metron-common/src/test/java/org/apache/metron/common/utils/HDFSUtilsTest.java

Line 35 in 9cee51e

public TemporaryFolder tempDir = new TemporaryFolder();

Add a cleanup hook in @Before that wipes a pre-determined temporary directory, if it exists, e.g. target/ogt/. One benefit of this approach is that if a test fails, the test data is still in the temp dir and can be readily inspected.
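The temp-directory idea can be sketched with only the JDK, independent of JUnit or the TestUtils helpers: write test data into a freshly created directory and wipe it explicitly when wanted. `TempDirSketch` and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical sketch: isolate test data in a fresh temp directory and clean it up on demand. */
class TempDirSketch {

  /** Creates a new, uniquely named directory and writes the data into a file inside it. */
  static Path writeToFreshTempDir(String fileName, byte[] data) {
    try {
      Path dir = Files.createTempDirectory("ogt");
      Path file = dir.resolve(fileName);
      Files.write(file, data);
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Deletes the directory and everything below it; does nothing if it does not exist. */
  static void deleteRecursively(Path dir) {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order visits children before their parents, so directories are empty when deleted.
      paths.sorted(Comparator.reverseOrder()).forEach(path -> {
        try {
          Files.delete(path);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path file = writeToFreshTempDir("test.ser", new byte[] {1, 2, 3});
    System.out.println(Files.exists(file));             // true
    deleteRecursively(file.getParent());
    System.out.println(Files.exists(file.getParent())); // false
  }
}
```

Calling the cleanup at the start of a test rather than the end keeps the data of a failed run around for inspection, which is the benefit described in the last option.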

This was also preexisting. Your suggestions make sense and I will add them in.

Done with the latest commit.

mmiklavc · 2019-05-22T20:01:01Z

...t/metron-enrichment-common/src/main/java/org/apache/metron/enrichment/cache/ObjectCache.java

+            .removalListener((path, value, removalCause) -> {
+              LOG.debug("Object retrieved from path '{}' was removed with cause {}", path, removalCause);
+            })
+            .build(new Loader(new Configuration(), config));


If there's one place to provide an opportunity for dependency injection, I think the Loader implementation is it.
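As an illustration of that injection point, the sketch below passes the loader in through the constructor so a test can supply canned data instead of reading from HDFS. The `ObjectLoader` interface and `InjectableLoaderCache` class are hypothetical stand-ins; the real ObjectCache wraps a caching library rather than a plain map.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: the cache depends on a loader abstraction supplied by the caller. */
class InjectableLoaderCache {

  private final Map<String, Object> cache = new ConcurrentHashMap<>();
  private final ObjectLoader loader;

  InjectableLoaderCache(ObjectLoader loader) {
    this.loader = loader;
  }

  /** Loads the object on first access and serves the cached copy afterwards. */
  Object get(String path) {
    return cache.computeIfAbsent(path, loader::load);
  }

  public static void main(String[] args) {
    // A test double: canned data keyed by path, no filesystem involved.
    Map<String, Object> canned = Collections.singletonMap(
        "/tmp/enrichments.ser", Collections.singletonMap("key", "value"));
    InjectableLoaderCache cache = new InjectableLoaderCache(canned::get);
    System.out.println(cache.get("/tmp/enrichments.ser")); // {key=value}
  }
}

/** Hypothetical loader abstraction; production code would read and deserialize from HDFS. */
interface ObjectLoader {
  Object load(String path);
}
```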

merrimanr · 2019-05-23T17:31:25Z

I believe I've addressed the feedback so far with the latest commit. Let me know how it looks now and what I'm missing.

mmiklavc · 2019-06-13T18:54:08Z

+1, nice work @merrimanr

initial commit

fcaf095

merrimanr added 3 commits May 13, 2019 10:48

switched to caffeine cache implementation

6869d07

added max file size setting

3b8d5f0

changed name

f1e47c6

mmiklavc suggested changes May 22, 2019

pr feedback

28ec844

missing license headers

6e20d53

merrimanr closed this May 24, 2019

merrimanr reopened this May 24, 2019

enrichment get config separate from object get config

a935ee7

mmiklavc approved these changes Jun 13, 2019

asfgit closed this in 38b8a78 Jun 17, 2019

mmiklavc mentioned this pull request Aug 22, 2019

METRON-2149: Shaded jar classifier is not consistent #1436

Closed

11 tasks
