Add support for JSON formatted logs

This adds a new extra package dependency on `json-logging`
and an environment variable, which when set to "true" regardless
of case, will try to use the json-logging non-web app log formatter.

If the `json-logging` package is not installed but the environment
variable is True, something like this will be logged but it will not
crash the application:

```
$ ENABLE_JSON_LOGGING=true jupyter notebook
Unable to use json logging due to missing packages. Run "pip install notebook[json-logging]" to fix.
Traceback (most recent call last):
  File "/home/osboxes/jupyter/notebook/notebook/notebookapp.py", line 144, in <module>
    import json_logging
ModuleNotFoundError: No module named 'json_logging'
```

Initially I tried to add a new Bool config option to toggle this but
the problem is (from my limited understanding of traitlets and tornado)
is that `_log_formatter_cls` needs to be set early and trying to se the
log formatter later in `init_logging` is too late - or at least I couldn't
figure out a way to reset the log formatter dynamically (I tried calling
`_log_format_changed` [1] but that didn't work).

With this working you get logs like this:

```
{"written_at": "2020-10-07T21:10:51.606Z", "written_ts": 1602105051606265000, "msg": "404 GET /nbextensions/widgets/notebook/js/extension.js (127.0.0.1) 9.26ms referer=http://localhost:8888/notebooks/Untitled.ipynb", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "WARNING", "module": "log", "line_no": 49}
{"written_at": "2020-10-07T21:10:54.443Z", "written_ts": 1602105054443309000, "msg": "Starting buffering for f260ddd7-938c-42d0-ac3b-455bea76694f:49f30b53fc4b4ec6a8f2fb748a171613", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "INFO", "module": "kernelmanager", "line_no": 222}
{"written_at": "2020-10-07T21:10:54.446Z", "written_ts": 1602105054446264000, "msg": "Kernel shutdown: f260ddd7-938c-42d0-ac3b-455bea76694f", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "INFO", "module": "multikernelmanager", "line_no": 201}
```

An obvious improvement here would to be able to split the `msg` fields apart
so that we can log things like response status code, request method, request
URL, response_time_ms etc. That should be possible with the `JSONLogWebFormatter`
from `json-logging` but when I tried using that I was getting errors from
the library about a correlation id (which is based on a request header we
don't use). The `json-logging` library supports several web frameworks like
Flask but unfortunately does not have built in support for Tornado, but it does
support custom formatters so that might be a follow up option to improve on this.

[1] https://github.com/ipython/traitlets/blob/4.3.3/traitlets/config/application.py#L195

Closes #5798
This commit is contained in:
Matt Riedemann 2020-10-07 15:17:22 -05:00
parent 2ba296039a
commit 1929fb53b8
2 changed files with 17 additions and 1 deletions

View File

@ -138,6 +138,19 @@ try:
except ImportError:
terminado_available = False
# Tolerate missing json_logging package.
enable_json_logs = os.getenv('ENABLE_JSON_LOGGING', 'false').lower() == 'true'
try:
import json_logging
except ImportError:
# If configured for json logs and we can't do it, log a hint.
if enable_json_logs:
logging.getLogger(__name__).exception(
'Unable to use json logging due to missing packages. '
'Run "pip install notebook[json-logging]" to fix.'
)
enable_json_logs = False
#-----------------------------------------------------------------------------
# Module globals
#-----------------------------------------------------------------------------
@ -702,7 +715,9 @@ class NotebookApp(JupyterApp):
password=(NotebookPasswordApp, NotebookPasswordApp.description.splitlines()[0]),
)
_log_formatter_cls = LogFormatter
# Unless there is a way to re-initialize the log formatter (like with _log_format_changed?)
# we need to load the json logging formatter early here otherwise traitlets complains.
_log_formatter_cls = json_logging.JSONLogFormatter if enable_json_logs else LogFormatter
@default('log_level')
def _default_log_level(self):

View File

@ -120,6 +120,7 @@ for more information.
'nbval', 'nose-exclude', 'selenium', 'pytest', 'pytest-cov'],
'docs': ['sphinx', 'nbsphinx', 'sphinxcontrib_github_alt', 'sphinx_rtd_theme'],
'test:sys_platform != "win32"': ['requests-unixsocket'],
'json-logging': ['json-logging']
},
python_requires = '>=3.5',
entry_points = {