
Custom remote logging strategies [Resolved]

I am implementing an asynchronous remote (to AWS S3) logging handler (in Python, but it doesn't matter) supporting 4 modes:

  1. Immediate: the messages get written immediately to S3 - I am trying to avoid this mode
  2. By queue size: the messages are buffered in a queue, and once the buffer reaches a certain size (say 500 KB), the whole content is persisted all at once to a file in S3
  3. By delimiters: all the messages contained between two string delimiters are treated as one batch and sent all at once, e.g. here "message\nmessage2\n" would go out in a single write:

    logger.info("#start#")
    logger.info("message")
    logger.info("message2#end#)
    
  4. By timer: every 10 seconds or so, the accumulated logs are sent to a file in S3.
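
To make this concrete, here is a stripped-down sketch of what modes 2 and 4 could look like (illustrative only: it uses boto3, the bucket name and key layout are placeholders, and error handling and retries are omitted):

    import logging
    import threading
    import time
    import uuid

    import boto3


    class BufferedS3Handler(logging.Handler):
        """Buffers formatted records and writes them to S3 in chunks."""

        def __init__(self, bucket, max_bytes=500 * 1024, interval=10.0):
            super().__init__()
            self.bucket = bucket
            self.max_bytes = max_bytes          # mode 2 threshold
            self.interval = interval            # mode 4 period, in seconds
            self._buffer = []
            self._size = 0
            self._stop = threading.Event()
            self._s3 = boto3.client("s3")
            self._timer = threading.Thread(target=self._run_timer, daemon=True)
            self._timer.start()

        def emit(self, record):
            # Handler.handle() already holds the handler lock around emit().
            line = self.format(record) + "\n"
            self._buffer.append(line)
            self._size += len(line)
            if self._size >= self.max_bytes:    # mode 2: flush by size
                self._do_flush()

        def _run_timer(self):                   # mode 4: flush by timer
            while not self._stop.wait(self.interval):
                self.acquire()
                try:
                    self._do_flush()
                finally:
                    self.release()

        def _do_flush(self):
            if not self._buffer:
                return
            body = "".join(self._buffer).encode("utf-8")
            key = "logs/%d-%s.log" % (int(time.time()), uuid.uuid4().hex)
            self._s3.put_object(Bucket=self.bucket, Key=key, Body=body)
            self._buffer.clear()
            self._size = 0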

The reason I decided to implement remote logging is that the cluster could crash at any time, or get terminated after finishing its processing, and I would then lose the locally stored logs. It would be helpful to know "why" it crashed, of course, but also, very importantly, "when" it crashed, especially for long-running operations, so that the processing could be resumed at that stage.

An idea would be to have a queue (with ActiveMQ, Kafka, etc.) to which messages are published in real time and then aggregated before going to S3, but I thought it would probably be overkill to drag in a whole broker infrastructure for this use case.


My implementation works, but my questions are more conceptual and best-practice oriented:

  • In the case of modes 2 (by queue size) and 4 (by timer), how can I be notified of the end of the program's execution, so that I can flush the content of the local queue and stop the timer thread? I currently mark the logging thread as a daemon, but that obviously makes me miss the last messages. I am now looking for a better way: avoiding daemon threads yet still getting notified that the main thread has finished, so these threads only terminate once everything has been pushed (see the sketch after these questions).

  • Does my approach make sense, or am I badly reinventing the wheel in a hacky way?
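
For the first question, the direction I am currently exploring (just a sketch building on the handler above, so it may well be the wrong approach): the logging module registers logging.shutdown() with atexit, and shutdown() calls flush() and close() on every handler at normal interpreter exit, so giving the handler real flush() and close() implementations should cover a clean termination, with an explicit atexit hook as a safety net:

    import atexit
    import logging


    class BufferedS3Handler(logging.Handler):
        # __init__, emit, _run_timer and _do_flush as in the sketch further up

        def flush(self):
            self.acquire()
            try:
                self._do_flush()            # push whatever is still buffered
            finally:
                self.release()

        def close(self):
            self._stop.set()                # let the timer thread exit cleanly
            self.flush()
            super().close()


    # Belt and braces: flush explicitly even if the shutdown order is unusual.
    handler = BufferedS3Handler("my-log-bucket")   # placeholder bucket name
    atexit.register(handler.flush)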


Question Credit: Mehdi B.
Asked January 11, 2019
Posted Under: Programming

Why would the cluster crash at any time? Stopping after it did its work is a perfectly fine scenario, but this should also mean that it sent, if needed, the logs to a remote location. But crashing?...

Independently of that:

  • Cases where the logs should be kept locally are rare. Centralize your logs; it makes your life easier.

  • Don't reinvent the wheel: use existing logging infrastructure when you can. On Linux, this would mean relying on a centralized syslog setup with two or more logging servers for failover, with most of the potential problems already solved for you.
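
For illustration (the host name below is made up), pointing Python's standard logging at a central syslog server takes only a few lines with the standard library:

    import logging
    import logging.handlers

    # Ship records over UDP to a central syslog server (hypothetical host).
    syslog = logging.handlers.SysLogHandler(address=("logs.internal.example", 514))
    syslog.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))

    logging.getLogger().setLevel(logging.INFO)
    logging.getLogger().addHandler(syslog)
    logging.getLogger("worker").info("job started")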

Amazon S3 is not really a good place for your logs anyway:

  • The first scenario would simply be cost-prohibitive: you can't reasonably pay to create an object in a bucket for every log entry, and if you have to read the objects back later, you'll pay a high cost once again.

  • The second scenario has the same problem you're trying to solve with crashing clusters. If you're storing logs in a queue and the cluster crashes, you'll lose on average half of your queue size; with a 500 KB queue, losing 250 KB means having no idea why the cluster crashed most of the time.

  • The third one has the combined problems of the first two: a potentially high S3 cost, plus the loss of the latest messages during a crash.

  • The fourth one is problematic for the same reason as the second one.

If you're unable to solve the crash problem and your only option is to rely on Amazon services (as opposed to your own logging infrastructure), consider:

  • Amazon Elasticsearch Service, which you can use (1) to store every message individually while paying much less than with S3, and (2) to query the logs easily or leverage the power of Kibana.

  • Amazon Kinesis Firehose, designed to ingest and deliver large volumes of streaming data such as logs. Not sure if it applies to your situation (a rough sketch follows this list).

  • Amazon RDS, to be used as an ordinary database. Maybe the most flexible, but also the one where you'll have to do most of the work.
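
To give an idea of the Firehose route (a sketch only: the delivery stream name is made up and must already exist, and in practice you would batch and retry the put_record calls):

    import json
    import logging

    import boto3


    class FirehoseHandler(logging.Handler):
        """Hands each record to a Kinesis Data Firehose delivery stream,
        which batches and delivers it to S3 (or Elasticsearch) for you."""

        def __init__(self, stream_name):
            super().__init__()
            self.stream_name = stream_name
            self.client = boto3.client("firehose")

        def emit(self, record):
            payload = json.dumps({
                "level": record.levelname,
                "logger": record.name,
                "message": self.format(record),
            }) + "\n"
            self.client.put_record(
                DeliveryStreamName=self.stream_name,
                Record={"Data": payload.encode("utf-8")},
            )


    logging.getLogger().addHandler(FirehoseHandler("cluster-logs"))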


credit: Arseni Mourzenko
Answered January 11, 2019