This repository was archived by the owner on Sep 17, 2025. It is now read-only.

Skeleton for Azure Log Exporter#657

Merged
reyang merged 20 commits into master from azure
May 24, 2019

Contributor

@reyang reyang commented May 20, 2019

Part of the #616 effort, using Azure log exporter for the pilot work.

I'd like to get early feedback on the design and usage. I haven't added the actual export logic yet; for now, all the exporter does is print logs to STDOUT.

We can explore whether it makes sense to move the following classes to the opencensus core library:

  • opencensus.ext.azure.log_exporter.BaseLogHandler
  • opencensus.ext.azure.log_exporter.Worker

The design principles:

  1. Align with Python logging practice, avoid introducing another pattern.
  2. The export logic should run asynchronously, instead of blocking the hot path.
  3. Try to align with the queue/worker direction for trace exporter.
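A minimal sketch of principles 2 and 3, using only the standard library. This is not the code in this PR: `QueueingHandler`, its `export` method, and the `exported` list are hypothetical names used for illustration. The point is that `emit` only enqueues the record, and a background worker drains the queue in batches.

```python
import logging
import queue
import threading


class QueueingHandler(logging.Handler):
    """Hypothetical handler: emit() only enqueues, a worker thread exports."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self, batch_size=100):
        super().__init__()
        self._queue = queue.Queue()  # each handler owns its own queue
        self._batch_size = batch_size
        self.exported = []  # stand-in for a real backend
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def emit(self, record):
        # Hot path: no I/O here, just hand the record to the worker.
        self._queue.put(record)

    def export(self, batch):
        # Placeholder export step; a real handler would send the batch out.
        self.exported.extend(self.format(record) for record in batch)

    def _run(self):
        stopping = False
        while not stopping:
            batch = [self._queue.get()]  # block until there is work
            # Drain whatever else is already queued, up to the batch size.
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            if self._STOP in batch:
                stopping = True
                batch = batch[:batch.index(self._STOP)]
            if batch:
                self.export(batch)

    def close(self):
        # Flush what is queued, stop the worker, then do the normal cleanup.
        self._queue.put(self._STOP)
        self._worker.join()
        super().close()


logger = logging.getLogger('queue_sketch')
logger.propagate = False
handler = QueueingHandler()
handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))
logger.addHandler(handler)
logger.warning('Hello, World!')
logger.removeHandler(handler)
handler.close()
```

Because the formatter is applied in the worker, formatting cost also stays off the caller's thread in this sketch; whether that is desirable for the real handler is a separate design question.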

This is what I expect customers to use:

import logging

from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)
logger.addHandler(AzureLogHandler())
logger.warning('Hello, World!')

A more complex scenario with multiple handlers:

import logging

from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)

# create azure log handler
oclh = AzureLogHandler()
oclh.setFormatter(logging.Formatter('%(message)s (%(pathname)s:L%(lineno)s)'))
logger.addHandler(oclh)

# create console log handler
ch = logging.StreamHandler()
ch.setFormatter(logging.Formatter('%(asctime)s %(message)s'))
logger.addHandler(ch)

# create stackdriver log handler
# sdlh = StackdriverLogHandler()
# logger.addHandler(sdlh)

logger.warning('Hello, World!')

@c24t @lzchen

@reyang reyang requested review from a team, c24t and songy23 as code owners May 20, 2019 21:06
@reyang reyang added azure Microsoft Azure logging labels May 20, 2019
Member

@c24t c24t left a comment

Exposing the exporter as a log handler and having the exporter accept LogRecords seem like good choices to me.

This looks good as WIP, but I think there are still API changes to make (remove emit, consider losing the shared event, etc.) before moving other classes out of the contrib package. We should also consider adding logging (vs. just log correlation) to the spec if we're going to expose a general purpose log exporter API.

for item in batch:
    trace_id = getattr(item, 'traceId', 'N/A')
    span_id = getattr(item, 'spanId', 'N/A')
    print('{levelname} {trace_id} {span_id} {pathname}:L{lineno} {msg}'.format(
        trace_id=trace_id, span_id=span_id, **vars(item)))
Member

I know this is a placeholder, but to make sure we're on the same page: I'd expect users to attach another handler to the same logger if they wanted to print the logs here.

Contributor Author

Yep, we're on the same page.
We don't expect users to go through OpenCensus in order to print/format logs; they can just follow the standard Python approach.
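As a rough illustration of that standard approach, the sketch below prints records with an ordinary `StreamHandler` and supplies the same `'N/A'` fallback for `traceId` and `spanId` that the placeholder uses. `TraceDefaultsFilter` is a hypothetical helper, and passing the IDs through `extra` is only an assumption for the demo; this conversation does not show how those attributes get set on real records.

```python
import io
import logging


class TraceDefaultsFilter(logging.Filter):
    """Hypothetical filter: make sure every record has traceId/spanId."""

    def filter(self, record):
        record.traceId = getattr(record, 'traceId', 'N/A')
        record.spanId = getattr(record, 'spanId', 'N/A')
        return True  # never drop the record


# StringIO stands in for the console so the output can be inspected.
stream = io.StringIO()
console = logging.StreamHandler(stream)
console.addFilter(TraceDefaultsFilter())
console.setFormatter(
    logging.Formatter('%(levelname)s %(traceId)s %(spanId)s %(message)s'))

logger = logging.getLogger('correlation_sketch')
logger.propagate = False
logger.addHandler(console)

logger.warning('no trace context')
# Assumption for illustration: the IDs arrive as extra record attributes.
logger.warning('with trace context', extra={'traceId': 'abc', 'spanId': 'def'})
```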

Contributor Author

Here goes the proposal:

import logging

from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)

# create azure log handler
oclh = AzureLogHandler()
oclh.setFormatter(logging.Formatter('%(message)s (%(pathname)s:L%(lineno)s)'))
logger.addHandler(oclh)

# create console log handler
ch = logging.StreamHandler()
ch.setFormatter(logging.Formatter('%(asctime)s %(message)s'))
logger.addHandler(ch)

# create stackdriver log handler
# sdlh = StackdriverLogHandler()
# logger.addHandler(sdlh)

logger.warning('Hello, World!')

Contributor Author

Simplifying this LGTM, but later we may still want to

  • revisit the worker/queue classes as per #642 (comment) and related comments

Yep, I'll merge this PR as is and start on the next PR today to add the actual Azure log exporter logic (currently it just prints to the console). Starting next week, I'll have time to revisit the worker/queue classes and refactor them out.

  • consider separating the exporter from the log handler when we e.g. send logs to the agent

We can explore the following options:

  1. Align across OC/OT SDKs - having an explicit exporter concept across all languages.
  2. Align more with language conventions (e.g. log appender for Java, log handler for Python), and provide a base/helper class to support configuration/policy.

Contributor Author

reyang commented May 23, 2019

Exposing the exporter as a log handler and having the exporter accept LogRecords seem like good choices to me.

After a bit more exploration, I think it makes more sense to combine the handler and exporter logic. This aligns better with Python practice (e.g. having the formatter associated with the handler). Each handler will have its own queue.

This looks good as WIP, but I think there are still API changes to make (remove emit, consider losing the shared event, etc.) before moving other classes out of the contrib package.

Yes, I agree. This is the direction we're moving towards.

We should also consider adding logging (vs. just log correlation) to the spec if we're going to expose a general purpose log exporter API.

After spending more time playing around with logging, I think it makes more sense to align logging APIs with the language/runtime instead of trying to align across SDKs. Thoughts? @c24t

Member

@c24t c24t left a comment

Simplifying this LGTM, but later we may still want to

  • revisit the worker/queue classes as per #642 (comment) and related comments
  • consider separating the exporter from the log handler when we e.g. send logs to the agent


def __init__(self, src, dst):
    self._src = src
    self._dst = dst
Member

Note the circular reference here; we've been bitten by this before when trying to clean up exporters. See:

def new_stats_exporter(options=None, interval=None):
    """Get a stats exporter and running transport thread.

    Create a new `StackdriverStatsExporter` with the given options and start
    periodically exporting stats to stackdriver in the background.

    Fall back to default auth if `options` is null. This will raise
    `google.auth.exceptions.DefaultCredentialsError` if default credentials
    aren't configured.

    See `opencensus.metrics.transport.get_exporter_thread` for details on the
    transport thread.

    :type options: :class:`Options`
    :param exporter: Options to pass to the exporter

    :type interval: int or float
    :param interval: Seconds between export calls.

    :rtype: :class:`StackdriverStatsExporter`
    :return: The newly-created exporter.
    """
    if options is None:
        _, project_id = google.auth.default()
        options = Options(project_id=project_id)
    if str(options.project_id).strip() == "":
        raise ValueError(ERROR_BLANK_PROJECT_ID)

    ci = client_info.ClientInfo(client_library_version=get_user_agent_slug())
    client = monitoring_v3.MetricServiceClient(client_info=ci)
    exporter = StackdriverStatsExporter(client=client, options=options)

    transport.get_exporter_thread(stats.stats, exporter, interval=interval)
    return exporter

Contributor Author

@reyang reyang May 24, 2019

My original understanding is that we have Handler.close, which can be called explicitly or implicitly (by the Python logging library), and that the circular reference will be collected by the GC. This shouldn't cause a problem, should it? Is the concern to prevent a memory leak, or to reduce GC overhead?

Regarding a memory leak, I ran the following app for an hour and saw flat memory usage:

import logging

from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)
while True:
    handler = AzureLogHandler()
    logger.addHandler(handler)
    logger.warning('Hello, World!')
    logger.removeHandler(handler)
    handler.close()

For GC overhead, given that the cycle is pretty small, I guess there is no noticeable difference?

Member

the circular reference will be collected by the GC. This shouldn't cause problem?

I thought there were more general problems cleaning up circular references in Python, but it may only be a problem for classes that define __del__ (see https://stackoverflow.com/a/2428888), in which case this is fine.
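A small sketch of the point being discussed, assuming `dst` refers back to the handler that owns the worker. `SketchWorker` and `SketchHandler` are hypothetical stand-ins, not the classes in this PR. Reference counting alone cannot free the pair, but the cycle collector can, since neither class defines `__del__`. (On Python 3.4 and later, PEP 442 also lets the collector handle cycles whose objects do define `__del__`.)

```python
import gc
import weakref


class SketchWorker:
    def __init__(self, src, dst):
        self._src = src
        self._dst = dst  # reference back to the owning handler


class SketchHandler:
    def __init__(self):
        self._queue = []
        # handler -> worker -> handler: a reference cycle
        self._worker = SketchWorker(self._queue, self)


gc.disable()  # keep automatic collection from interfering with the demo
try:
    handler = SketchHandler()
    probe = weakref.ref(handler)  # lets us observe whether the object is alive

    del handler
    # The cycle keeps the reference count above zero, so nothing is freed yet.
    still_alive = probe() is not None

    gc.collect()  # an explicit collection finds and frees the cycle
    collected = probe() is None
finally:
    gc.enable()

print(still_alive, collected)
```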

@reyang reyang merged commit fe82b7e into master May 24, 2019
@reyang reyang deleted the azure branch May 24, 2019 20:15
@reyang reyang mentioned this pull request May 30, 2019