Introduction
In this chapter, we will explore several advanced features and usage examples of the Jupyter Notebook. As we have only seen basic features in the previous chapters, we will dive deeper into the architecture of the Notebook here.
The Notebook ecosystem
Jupyter notebooks are represented as JavaScript Object Notation (JSON) documents. JSON is a language-independent, text-based file format for representing structured documents. As such, notebooks can be processed by any programming language, and they can be converted to other formats such as Markdown, HTML, LaTeX/PDF, and others.
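Because a notebook is plain JSON, we can inspect it with nothing more than the standard library. The following is a minimal sketch: the `notebook` dictionary is a hypothetical, hand-written document that shows only a simplified subset of the fields of the version 4 format, and `code_sources` is a helper name introduced for illustration.

```python
import json

# A hypothetical, minimal notebook (simplified subset of the v4 format).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n"]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('Hello')\n"],
        },
    ],
}

# Serialize to the text that would be stored in an .ipynb file,
# then parse it back as any JSON-aware language could do.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)


def code_sources(nb):
    """Return the source of every code cell, one string per cell."""
    return [
        "".join(cell["source"])
        for cell in nb["cells"]
        if cell["cell_type"] == "code"
    ]


sources = code_sources(loaded)
print(sources)
```

For real notebooks, a dedicated library that validates the document structure is generally a safer choice than handling the raw JSON directly.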
There is an ecosystem of tools around Jupyter Notebook. Notebooks are being used to create slides, teaching materials, blog posts, research papers, and even books. In fact, this very book is entirely written in the Notebook using the Markdown format and a custom-made Python tool.
JupyterLab is the next generation of the Jupyter Notebook. It is still in an early stage of development at the time of writing. We cover it in the last recipe of this chapter.
Architecture of the Jupyter Notebook
Jupyter implements a two-process model, with a kernel and a client. The client is the interface through which the user sends code to the kernel. The kernel executes the code and returns the result to the client for display. In Read-Evaluate-Print Loop (REPL) terminology, the kernel implements the Evaluate step, whereas the client implements the Read and Print steps.
The client can be a Qt widget if we run the Qt console, or a browser if we run the Jupyter Notebook. In the Jupyter Notebook, the kernel receives entire cells at once, so it has no notion of a notebook. The linear document that makes up the notebook is therefore strongly decoupled from the underlying kernel.
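The split between client and kernel can be illustrated with a toy model. This is only a sketch under strong assumptions: a thread stands in for the separate kernel process, two in-memory queues stand in for the messaging layer, and the names `toy_kernel` and `run_cell` are made up for this example. It is not how Jupyter is implemented.

```python
import contextlib
import io
import queue
import threading


def toy_kernel(requests, replies):
    """Evaluate: run each received cell in one persistent namespace."""
    namespace = {}
    while True:
        code = requests.get()
        if code is None:  # sentinel asking the kernel to shut down
            break
        buffer = io.StringIO()
        try:
            # Capture whatever the cell prints so it can be sent back.
            with contextlib.redirect_stdout(buffer):
                exec(code, namespace)
        except Exception as exc:
            replies.put(("error", repr(exc)))
        else:
            replies.put(("ok", buffer.getvalue()))


requests, replies = queue.Queue(), queue.Queue()
kernel = threading.Thread(target=toy_kernel, args=(requests, replies))
kernel.start()


def run_cell(code):
    """Read and Print side: send a whole cell, wait for the reply."""
    requests.put(code)
    return replies.get()


r1 = run_cell("x = 21")          # state is kept by the kernel
r2 = run_cell("print(x * 2)")    # a later cell can reuse it
r3 = run_cell("1 / 0")           # errors come back as replies too

requests.put(None)
kernel.join()

for status, payload in (r1, r2, r3):
    print(status, repr(payload))
```

Note that the toy kernel only ever sees strings of code, one cell at a time. It knows nothing about the order of cells in a document, which mirrors the decoupling described above.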
All communication procedures between the different processes are implemented on top of the ZeroMQ (ZMQ) messaging library (http://zeromq.org). In the Jupyter Notebook, the browser talks to the Notebook server over WebSocket, a TCP-based protocol implemented in modern web browsers, and the server relays messages to and from the underlying kernel over ZeroMQ.
In a notebook, typing %connect_info in a cell gives the information we need to connect a new client (such as a Qt console) to the underlying kernel:
>>> %connect_info
{
  "shell_port": 58645,
  "iopub_port": 47422,
  "stdin_port": 60550,
  "control_port": 39092,
  "hb_port": 49409,
  "ip": "127.0.0.1",
  "key": "2298f955-7020b0ce534e7a8d81053d43",
  "transport": "tcp",
  "signature_scheme": "hmac-sha256",
  "kernel_name": ""
}

Paste the above JSON into a file, and connect with:
    $> jupyter <app> --existing <file>
or, if you are local, you can connect with just:
    $> jupyter <app> --existing kernel-4342f625-a8...
or even just:
    $> jupyter <app> --existing
if this is the most recent Jupyter kernel you have started.
Here, <app> is console, qtconsole, or notebook.
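To make the connection information more concrete, the sketch below parses the JSON shown above and derives one address per channel. The `transport://ip:port` form is the usual ZeroMQ endpoint syntax for TCP; the helper name `channel_endpoints` and the `CHANNELS` tuple are introduced here for illustration, and a real client would rely on the Jupyter client library rather than on this code.

```python
import json

# The connection information printed by %connect_info above.
connection_info = json.loads("""
{
  "shell_port": 58645,
  "iopub_port": 47422,
  "stdin_port": 60550,
  "control_port": 39092,
  "hb_port": 49409,
  "ip": "127.0.0.1",
  "key": "2298f955-7020b0ce534e7a8d81053d43",
  "transport": "tcp",
  "signature_scheme": "hmac-sha256",
  "kernel_name": ""
}
""")

# Each "<name>_port" entry corresponds to one communication channel.
CHANNELS = ("shell", "iopub", "stdin", "control", "hb")


def channel_endpoints(info):
    """Map every channel name to a ZeroMQ-style endpoint string."""
    base = f"{info['transport']}://{info['ip']}"
    return {name: f"{base}:{info[name + '_port']}" for name in CHANNELS}


endpoints = channel_endpoints(connection_info)
for name, url in endpoints.items():
    print(f"{name:<8}{url}")
```

The `key` and `signature_scheme` fields are not used in this sketch. They relate to the signing of messages exchanged with the kernel.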
JupyterHub, available at https://jupyterhub.readthedocs.io/en/latest/, is a Python library that can be used to serve notebooks to a set of end-users, for example students of a particular class, or lab members in a research group. It handles user authentication and other low-level details.
Security in notebooks
It is possible for an attacker to put malicious code in a Jupyter notebook. Since notebooks may contain hidden JavaScript code in a cell output, it is theoretically possible for malicious code to execute surreptitiously when the user opens a notebook.
For this reason, Jupyter has a security model where HTML and JavaScript code in a notebook can be either trusted or untrusted. Outputs generated by the user are always trusted. However, outputs that were already there when the user first opened an existing notebook are untrusted.
The security model is based on a cryptographic signature present in every notebook. This signature is generated using a secret key owned by every user.
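The principle behind such a signature can be sketched with a keyed hash (HMAC) from the standard library. This is an illustration of the idea only, under several assumptions: the secret, the tiny notebook dictionary, and the functions `sign` and `is_trusted` are all hypothetical, and the way Jupyter actually computes and stores signatures is described in the security documentation listed in the references.

```python
import hashlib
import hmac
import json


def sign(notebook, secret):
    """Compute an HMAC-SHA256 digest of a canonical JSON serialization."""
    payload = json.dumps(notebook, sort_keys=True).encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_trusted(notebook, signature, secret):
    """A notebook is trusted if its content still matches the signature."""
    return hmac.compare_digest(sign(notebook, secret), signature)


# Hypothetical per-user secret and a tiny notebook-like document.
secret = b"hypothetical-per-user-secret"
notebook = {
    "cells": [{"cell_type": "code", "source": "1 + 1", "outputs": ["2"]}]
}
signature = sign(notebook, secret)

# Someone without the secret alters an output after the fact.
tampered = json.loads(json.dumps(notebook))
tampered["cells"][0]["outputs"] = ["<script>alert('x')</script>"]

trusted_original = is_trusted(notebook, signature, secret)
trusted_tampered = is_trusted(tampered, signature, secret)
print(trusted_original, trusted_tampered)
```

Since the attacker does not know the user's secret key, they cannot produce a valid signature for the modified content, so the altered outputs remain untrusted.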
References
The following are some references about the Notebook architecture:
- Overview of IPython at http://ipython.readthedocs.io/en/stable/overview.html
- Documentation for the Jupyter Notebook, available at https://jupyter.readthedocs.io/en/latest/
- Security in the Notebook, described at http://jupyter-notebook.readthedocs.io/en/stable/security.html
- The Jupyter messaging protocol, at http://jupyter-client.readthedocs.io/en/latest/messaging.html
- Wrapper kernels at http://jupyter-client.readthedocs.io/en/latest/wrapperkernels.html
Here are a few kernels in non-Python languages for the Notebook:
- IJulia, available at https://github.com/JuliaLang/IJulia.jl
- IRkernel, available at https://github.com/IRkernel/IRkernel
- IHaskell, available at https://github.com/gibiansky/IHaskell
- Dozens of kernels are referenced at https://github.com/jupyter/jupyter/wiki/J