This introduces the framework for supporting S3 event notifications. An S3 event notification invokes the Lambda function when an AWS service writes new logs to an S3 bucket (the S3 object creation event). While this PR mostly lays the groundwork for future support of AWS services that log to S3, it currently supports CloudTrail logs from S3 (in addition to the existing CloudTrail CloudWatch support).
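For readers unfamiliar with the event shape, here is a minimal Python sketch of pulling the created objects out of an S3 event notification. It is not the code in this PR: the function name `s3_object_refs` and the sample bucket and key are made up for illustration. Only the `Records[].s3.bucket.name` / `Records[].s3.object.key` layout and the URL-encoded keys come from the documented S3 event format.

```python
from urllib.parse import unquote_plus


def s3_object_refs(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) for every object-created record in an S3 event.

    Hypothetical helper for illustration; not taken from the PR.
    """
    refs = []
    for record in event.get("Records", []):
        # Skip anything that is not a creation event (e.g. ObjectRemoved:*).
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3 = record["s3"]
        # Object keys are URL-encoded in S3 notifications ("+" means space).
        refs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return refs


# Made-up sample event with two records; only the first is a creation event.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-trail-bucket"},
                "object": {"key": "AWSLogs/my+trail/log%3D1.json.gz"},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "example-trail-bucket"},
                "object": {"key": "old.json.gz"},
            },
        },
    ]
}

refs = s3_object_refs(sample_event)
```

Each returned pair identifies one S3 object that would then be read, parsed, and converted.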
A single S3 notification may contain multiple records from the creation of multiple S3 objects, each of which needs to be read, parsed, and converted. By default this processes five S3 objects concurrently and emits logs to the logs pipeline in batches of up to 1k. This may mean that the ordering of the S3 objects, and of the logs, is not maintained: S3 objects listed later in the event notification may be loaded, parsed, and exported before earlier ones. In theory this can be fixed by setting FORWARDER_S3_MAX_PARALLEL_OBJECTS=1. However, S3 event notifications may be delivered to multiple Lambda invocations concurrently, so ordering is not guaranteed in principle.

There's some refactoring that I plan to do later that will clean up sharing between the CloudWatch and S3 logs support. In addition, the CW logs support should be broken out of the parse module, similar to this new s3logs support.
The PR required some changes to the acker component because we don't know ahead of time how many acks to expect. Instead of buffering all acks, we spawn a listener that consumes the acks/nacks concurrently with processing the S3 objects.
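As a rough illustration of the listener approach, the Python sketch below consumes acks and nacks from a queue as they arrive instead of collecting a known number up front. All names here (`spawn_ack_listener`, `AckSummary`, the `_DONE` sentinel) are hypothetical, and the real acker may signal completion differently.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class AckSummary:
    acked: int = 0
    nacked: int = 0


_DONE = object()  # hypothetical sentinel: no more acks will arrive


def spawn_ack_listener(acks: queue.Queue) -> tuple[threading.Thread, AckSummary]:
    """Start a thread that tallies acks (True) and nacks (False) as they arrive."""
    summary = AckSummary()

    def listen() -> None:
        while True:
            item = acks.get()
            if item is _DONE:
                return
            if item:
                summary.acked += 1
            else:
                summary.nacked += 1

    thread = threading.Thread(target=listen, daemon=True)
    thread.start()
    return thread, summary


def run_demo() -> AckSummary:
    acks: queue.Queue = queue.Queue()
    listener, summary = spawn_ack_listener(acks)
    # Stand-in for the pipeline acking batches while objects are processed;
    # the total is not known to the listener in advance.
    for i in range(7):
        acks.put(i % 3 != 0)  # i = 0, 3, 6 are nacks
    acks.put(_DONE)  # signal completion once all objects are processed
    listener.join()
    return summary


result = run_demo()
```

The listener never needs the total count: it runs until the producer side signals that processing has finished.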
Fixes: #5