METRON-1879 Allow Elasticsearch to Auto-Generate the Document ID #1269
Document toPatch = retrieveLatestDao.getLatest(guid, sensorType);
if(toPatch != null && toPatch.getDocument() != null) {
  originalSource = toPatch.getDocument();
  documentID = toPatch.getDocumentID().orElse(null);
We need to set the doc ID when patching an already indexed document.
} catch (Throwable e) {
  container = new DocumentContainer(e);
  LOG.error("Unable to find latest document; indexDao={}, error={}",
We had no logging to indicate when a particular IndexDao fails. I ran into this and it took some effort to trace since we lacked logging.
} catch (Throwable e) {
  container = new DocumentContainer(e);
  LOG.error("Unable to remove comment from alert; indexDao={}, error={}",
We had no logging to indicate when a particular IndexDao fails. I ran into this and it took some effort to trace since we lacked logging.
} catch (Throwable e) {
  container = new DocumentContainer(e);
  LOG.error("Unable to add comment to alert; indexDao={}, error={}",
We had no logging to indicate when a particular IndexDao fails. I ran into this and it took some effort to trace since we lacked logging.
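The pattern in the three hunks above (catch the failure, wrap it in a container, and log which DAO failed) might be sketched as follows. This is only an illustration under stated assumptions: `DaoFailureLoggingSketch`, `Container`, and `callAll` are hypothetical names, and `java.util.logging` stands in for the logger used in the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch, not Metron code: call every DAO, keep going when one
// fails, and log enough context to tell which DAO was responsible.
final class DaoFailureLoggingSketch {

  private static final Logger LOG =
      Logger.getLogger(DaoFailureLoggingSketch.class.getName());

  private DaoFailureLoggingSketch() {}

  // Holds either the value a DAO returned or the error it threw.
  static final class Container<T> {
    final T value;
    final Throwable error;

    Container(T value, Throwable error) {
      this.value = value;
      this.error = error;
    }
  }

  // Applies 'call' to each DAO and returns one container per DAO, in order.
  static <D, T> List<Container<T>> callAll(List<D> daos, Function<D, T> call) {
    List<Container<T>> results = new ArrayList<>();
    for (D dao : daos) {
      try {
        results.add(new Container<>(call.apply(dao), null));
      } catch (Throwable e) {
        // Name the failing DAO in the message so the failure can be traced.
        LOG.log(Level.SEVERE,
            "Unable to find latest document; indexDao=" + dao.getClass().getSimpleName(), e);
        results.add(new Container<>(null, e));
      }
    }
    return results;
  }
}
```

Catching `Throwable` mirrors the diff; a narrower exception type may be preferable in other code.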
- query = idsQuery.addIds(guid);
+ // should match any of the guids
+ // the 'guid' field must be of type 'keyword' or this term query will not match
+ BoolQueryBuilder guidQuery = boolQuery().must(termsQuery(Constants.GUID, guids));
We can no longer use an id query, since the doc ID is no longer the same as the GUID.
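To make the difference concrete, the sketch below renders the two request bodies as plain JSON strings. It does not use the Elasticsearch client; `GuidQuerySketch` and its methods are hypothetical helpers, and the field name `guid` is assumed to be the value of `Constants.GUID`.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Metron: renders Elasticsearch query DSL
// bodies as strings so the two lookup strategies can be compared side by side.
final class GuidQuerySketch {

  private GuidQuerySketch() {}

  // Renders values as a JSON array of strings. Assumes the values need no
  // JSON escaping, which holds for UUID-style GUIDs.
  static String jsonArray(List<String> values) {
    return values.stream()
        .map(v -> "\"" + v + "\"")
        .collect(Collectors.joining(",", "[", "]"));
  }

  // Old approach: match on the document ID. Only finds a document while
  // its document ID equals the GUID.
  static String idsQuery(List<String> guids) {
    return "{\"query\":{\"ids\":{\"values\":" + jsonArray(guids) + "}}}";
  }

  // New approach: match any of the GUIDs against the 'guid' field in the
  // document source. The field must be mapped as 'keyword' for exact matching.
  static String guidTermsQuery(List<String> guids) {
    return "{\"query\":{\"bool\":{\"must\":[{\"terms\":{\"guid\":"
        + jsonArray(guids) + "}}]}}}";
  }
}
```

Because the terms query matches on a source field, it finds a document whether its document ID is the GUID or an auto-generated value.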
private Document toDocument(SearchResult result, Long timestamp) {
  Document document = Document.fromJSON(result.getSource());
  document.setTimestamp(timestamp);
  document.setDocumentID(result.getId());
Need to set the document ID when retrieving the document.
  addFilter(field: string, value: string) {
-   field = (field === 'id') ? 'guid' : field;
+   field = (field === 'id') ? '_id' : field;
Allows a user to filter on the document ID, if they choose to add it to their view/table.
  onSort(sortEvent: SortEvent) {
    let sortOrder = (sortEvent.sortOrder === Sort.ASC ? 'asc' : 'desc');
-   let sortBy = sortEvent.sortBy === 'id' ? 'guid' : sortEvent.sortBy;
+   let sortBy = sortEvent.sortBy === 'id' ? '_uid' : sortEvent.sortBy;
Allows a user to sort by the document ID, if they choose to add it to their view/table.
  </td>
  <td [attr.colspan]="alertsColumnsToDisplay.length - 1">
-   <a (click)="addFilter('guid', alert.id)" [attr.title]="alert.id" style="color:#689AA9"> {{ alert.source['name'] ? alert.source['name'] : alert.id | centerEllipses:20:cell }}</a>
+   <a (click)="addFilter('guid', alert.source['guid'])" [attr.title]="alert.source['guid']" style="color:#689AA9"> {{ alert.source['name'] ? alert.source['name'] : alert.source['guid'] | centerEllipses:20:cell }}</a>
Displays the GUID, if a user has not renamed a meta alert.
This is probably relatively minor, but we seem to map between the underlying "_id" and "id" in the UI. However, in the search box I can only search with "_id:blah", not "id:blah", probably because we pass the query text through directly. Making that work in the general case (given that there could be other modifiers) is probably not worth it, given that it's an id field. However, are we okay with the user having to know that, to find a particular id in the alerts UI, they need to use "_id"? It's particularly confusing because the results of a findOne don't include "id" at all (because it's an ES-added field, not a Metron document field), while the results of a search do include "id". I'm a bit confused why the search endpoint returns "id" when directly hitting ES returns "_id". I don't see anything here that would obviously cause it, but maybe I'm not looking in the right place.
I assume it's unrelated to this PR, but I did want to point out that renaming the meta-alert doesn't actually take effect in the Alerts UI pane until something else happens (like a new search or an escalation). Basically, it doesn't do the optimistic updating we do elsewhere; the new name only shows up after the next search.
The UI code has a few places where it gives special treatment to a field called "id" that I had to remap. One related thing I did ensure we handle is that if you add the "id" field to the table and then click on it, it will correctly populate the search bar so that it filters for that document ID. That is shown in the screenshot of step 5 in the testing instructions. From that screenshot, you can see that it actually adds _id to the search bar, which confirms what you are seeing. Logically, it does make sense to me that a user should be able to type "id:XYZ" into the search bar and have it filter by document ID, since the column is labelled "id". Either that, or we should get rid of this special thing called "id" and represent it consistently in the UI as "_id". It really should be one way or the other. I can look into this and see how much effort it would take.
What is your opinion on this? Do you think it is worth the effort?
Also note that this inconsistency exists in master too. (1) The table has a column called "id" that, when you click on it, populates the search bar with "guid:fff6c243-dedc-4b46-b005-9a187dbb9d24". (2) If I manually type in "id:fff6c243-dedc-4b46-b005-9a187dbb9d24", the UI returns no results because it doesn't map "id" to the guid. That being said, the behavior in this PR is rather similar to master in this way. Or is that just me trying to make a case to be lazy and not address this issue? :)
Was the intent of the id mapping to make our client-side searching agnostic of the document index technology? It seems like a reasonable thing to want our client-facing querying interface to be shielded from changes to our underlying dependencies.
@nickwallen are you able to search for "_id", or is that restricted by the UI due to special chars? I think we might want to encapsulate a formal concept of an "id" field that maps to whatever the underlying doc store is (solr/es/other), and it sounds like we already do this to some extent. But I might also expect that users could search on fields specific to the doc store that we don't explicitly handle, e.g. id maps to _id for ES, but maybe you could also search for "_id" explicitly, if desired. Again, this may cause problems with syntax parsing in the UI, so I'm just thinking out loud.
Yes, see the screenshot in step 5 of the test instructions.
If it exists in master, I'm fine leaving it working that way here (maybe document it somewhere?). I'd like to see a follow-on Jira to clean this up though. It seems like we wanted some special casing, but it's not clear what for, and it's obviously not a clean implementation.
+1, assuming @mmiklavc is fine with the "id". Ran it up and it worked really well. Thanks for the contribution!
+1 by inspection. On the subject of id vs _id vs _uid vs guid: I do think we need to clean this up at some point. _uid, which is _type/doctype + _id, is used for sorting because you can't sort on _id in ES. In 6.x, _uid is deprecated (https://www.elastic.co/guide/en/elasticsearch/reference/6.0/mapping-uid-field.html). We've touched on this before, but we should consider a translation layer that manages variations between our optional downstream indexing engines. I'm not a big fan of this implementation detail bleeding into the UI, but I recognize that this is bigger than just the id field.
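One possible shape for the translation layer suggested in this thread is sketched below. It is purely illustrative and not part of this PR: `IdFieldTranslator` and its methods are hypothetical, and the only mappings it encodes are the ones visible in the diff (the UI's `id` becomes `_id` for filtering and `_uid` for sorting on Elasticsearch).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a per-engine field translation layer. The UI would
// always say "id"; the translator decides what the index engine calls it.
final class IdFieldTranslator {

  private final Map<String, String> filterFields;
  private final Map<String, String> sortFields;

  private IdFieldTranslator(Map<String, String> filterFields, Map<String, String> sortFields) {
    this.filterFields = Collections.unmodifiableMap(new HashMap<>(filterFields));
    this.sortFields = Collections.unmodifiableMap(new HashMap<>(sortFields));
  }

  // Mappings taken from the diff in this PR: filter on _id, sort on _uid.
  static IdFieldTranslator forElasticsearch() {
    Map<String, String> filter = new HashMap<>();
    filter.put("id", "_id");
    Map<String, String> sort = new HashMap<>();
    sort.put("id", "_uid");
    return new IdFieldTranslator(filter, sort);
  }

  // Fields without a mapping (for example "guid") pass through unchanged.
  String filterField(String uiField) {
    return filterFields.getOrDefault(uiField, uiField);
  }

  String sortField(String uiField) {
    return sortFields.getOrDefault(uiField, uiField);
  }

  // Builds a "field:value" clause like the ones shown in the search bar.
  String filterClause(String uiField, String value) {
    return filterField(uiField) + ":" + value;
  }
}
```

With something like this in place, a user could type "id:XYZ" and the client would send `_id:XYZ` to Elasticsearch, while a different engine could supply its own mapping.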


With this change, when documents are written to Elasticsearch the document ID is no longer set as the Metron GUID, but instead left unset so that Elasticsearch can auto-generate it. Doing this improves write performance into Elasticsearch.
The same should hold for any Lucene-based indexer, including Solr. This work only covers Elasticsearch, but the same should be done for Solr as part of a separate effort.
This change is dependent on the following open pull requests.
/api/v1/update/replace endpoint #1284
Changes
The ElasticsearchRetrieveLatestDao was updated since the GUID is no longer the document ID. It now does a terms query on the GUID field instead of an ID query.
The Document class now contains an optional documentID field. If the Document is retrieved from one of the DAOs, this field will be populated. When creating a new document, this field will be empty.
Many of the integration tests had to be updated because the GUID and document ID are now different.
The Alerts UI was updated so that it looks the same as before. By default, the Metron GUID is still shown as one of the first columns in the table.
This change is backwards compatible. The Alerts UI should continue to work regardless of whether the underlying indices were written with the Metron GUID as the document ID, with the auto-generated document ID, or with a mix of both.
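The optional document ID and the backwards-compatibility claim can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the real DAO: `SketchDocument` is a cut-down stand-in for `Document`, and `SketchIndex` is a hypothetical index that auto-generates an ID when none is supplied.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Cut-down stand-in for Metron's Document: a GUID, a source map, and a
// document ID that is only known once the document comes back from an index.
final class SketchDocument {
  private final String guid;
  private final Map<String, Object> source;
  private final String documentID; // null until retrieved from an index

  SketchDocument(String guid, Map<String, Object> source, String documentID) {
    this.guid = guid;
    this.source = source;
    this.documentID = documentID;
  }

  String getGuid() { return guid; }
  Map<String, Object> getSource() { return source; }
  Optional<String> getDocumentID() { return Optional.ofNullable(documentID); }
}

// Hypothetical in-memory index keyed by document ID.
final class SketchIndex {
  private final Map<String, Map<String, Object>> docsById = new LinkedHashMap<>();
  private int nextId = 0;

  // Writes a document. Passing a null docId mimics letting the index
  // auto-generate the ID; passing the GUID mimics the old behavior.
  String index(String docId, String guid) {
    String id = (docId != null) ? docId : "auto-" + (nextId++);
    Map<String, Object> source = new HashMap<>();
    source.put("guid", guid);
    docsById.put(id, source);
    return id;
  }

  // Looks up by the 'guid' field in the source, never by document ID, so it
  // works for both old-style and auto-generated IDs.
  Optional<SketchDocument> getLatest(String guid) {
    for (Map.Entry<String, Map<String, Object>> e : docsById.entrySet()) {
      if (guid.equals(e.getValue().get("guid"))) {
        return Optional.of(new SketchDocument(guid, e.getValue(), e.getKey()));
      }
    }
    return Optional.empty();
  }
}
```

A retrieved document carries whichever ID the index holds for it, so a later patch can target that ID; a newly created document has an empty ID until it is written and read back.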
Testing
Spin up Full Dev.
Open up the Alerts UI and perform the following basic actions.
Click on the configure wheel and add the 'id' field to the table view. This will now display both the GUID and document ID in the table. They of course will be different.
Click on the 'guid' field in any row to filter the search results by the guid.

Click on the 'id' field to filter the search results by the document ID.

Group by some fields to drill into the data. In the tree view, click on the 'guid' column and ensure the data sorts correctly. Do the same for the 'id' column that was added.

Perform the following actions on a meta-alert.
Pull Request Checklist