First try with Bazel
I have known about Bazel for a while, and it has been on my #ToLearn list for just as long. Recently I gave it a try, and this blog post summarizes what I learned and my impressions of Bazel.
For those who are new to Bazel: Bazel is a build tool released by Google. You can quickly check all the benefits of Bazel on the website. There is also an impressive list of who is using Bazel. What brings me to Bazel is speed and consistency. I had read many articles about how Bazel helped speed up builds by 2x-3x. Bazel runs builds in a sandboxed environment, which ensures predictable, environment-independent results. I’m also interested in using Bazel to build Docker images: unlike a normal Docker build, Bazel produces deterministic and reproducible images.
A simple exercise
For a simple exercise to try out Bazel, I wanted to convert a bash build script for a Python package to use Bazel. Basically, the script copies some files, generates some metadata files, and then invokes the sdist command to generate a Python package. The Python package is then used for a Docker image. I know Bazel is more useful for projects that require compilation (like C++ or Java), but I still wanted to try.
The first thing I did was create a WORKSPACE file, then a BUILD file for each package. Bazel requires me to break the project down into packages and to explicitly specify the files or outputs of each package. The build dependencies can now be visualized as in the picture below. Before Bazel, I just treated the whole thing as one folder.
Bazel is quite strict. For example, if I use the py_library rule, it throws an error if there is a non-Python file in srcs. Another thing is that I cannot refer to other packages using relative paths; Bazel requires me to spell out the absolute package names.
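To make the strictness concrete, here is a minimal sketch of what a package BUILD file could look like. The target names python and migration_script match the labels I use later, but the rule choices, glob patterns, and directory names are assumptions for illustration, not a copy of my real file.

```starlark
# arimo_pw/BUILD (hypothetical sketch; globs and directory names are assumptions)
# Depending on the Bazel version, py_library may need to be loaded from rules_python.

# Python sources only: a non-.py file listed in srcs is rejected.
py_library(
    name = "python",
    srcs = glob(["*.py"]),
    visibility = ["//visibility:public"],
)

# Non-Python files get their own target instead of riding along in srcs.
filegroup(
    name = "migration_script",
    srcs = glob(["migrations/*.sql"]),
    visibility = ["//visibility:public"],
)
```

From any other package, these targets are referenced by their full labels, such as //arimo_pw:python, rather than by a relative path like ../arimo_pw.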
The only thing I don’t like so far is that Bazel does not have official support for building PyPI packages. There is one experiment, but I found it unsuitable for me, as I already have a setup.py and a package manifest. I worked around this using a genrule.
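genrule(
    name = "arimo_pw_pkg",
    # Inputs must be listed explicitly: package targets plus packaging files.
    srcs = [
        "//arimo_pw:python",
        "//arimo_pw:migration_script",
        "//arimo_pw:metadata",
    ] + ["setup.py", "MANIFEST.in"] + glob(["requirements*.txt"]),
    outs = ["arimo_pw.tar.gz"],
    # Copy the generated metadata into the package, build the sdist,
    # then move the archive to the declared output path.
    cmd = " && ".join([
        "set -x",
        "cp $(locations //arimo_pw:metadata) arimo_pw/",
        "ls -la arimo_pw",
        "python setup.py sdist",
        "mv dist/arimo_pw* $(location arimo_pw.tar.gz)",
    ]),
)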
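For readers new to genrule, the placeholders in the cmd string may be easier to read in a smaller example. The sketch below uses a hypothetical target named requirements_bundle with a made-up output file; it is not part of my project and only illustrates how the substitutions work.

```starlark
# Hypothetical target, only to illustrate genrule placeholders.
genrule(
    name = "requirements_bundle",
    srcs = glob(["requirements*.txt"]),
    outs = ["all_requirements.txt"],
    # $(SRCS) expands to the paths of all srcs; $@ is the single declared output.
    cmd = "cat $(SRCS) > $@",
)
```

In the same spirit, $(location label) expands to the path of a label that yields exactly one file, while $(locations label) expands to a space-separated list of paths. That is why the metadata target, which can contain several files, uses the plural form.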
The benefits
Technically, Bazel just helps me move files around and invoke a bash command. But it is actually better than I expected. Because I need to explicitly spell out the file names, Bazel helped me find out that my previous build script included unused files in the final package. It also regenerates the package file only when there are changes in the project, whereas my bash script always generates a new one. Bazel also gives me a framework for reasoning about build dependencies. Now I am confident about adopting Bazel as my primary build tool.
There are other parts of Bazel that I want to explore. Next up is running tests, which is supposed to be as fast and reliable as the build. I will definitely give it a try sometime. Another thing is that Bazel is extensible: its syntax is Python-like, and you can use it to extend the build tool (I’m a Python guy, so this is good news!). I will post more as I learn more about Bazel 🙂.
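As a small taste of that extensibility, the genrule workaround from earlier could be wrapped in a macro. This is only a sketch of the idea: the file name sdist.bzl, the macro name sdist_package, and its parameters are all hypothetical, and it assumes setup.py sits in the same package and produces a single .tar.gz archive.

```starlark
# sdist.bzl (hypothetical file and macro names)
def sdist_package(name, srcs, out):
    """Wraps a genrule that runs setup.py sdist and collects the archive."""
    native.genrule(
        name = name,
        srcs = srcs,
        outs = [out],
        # Assumes sdist leaves exactly one .tar.gz in dist/.
        cmd = "python setup.py sdist && mv dist/*.tar.gz $@",
    )
```

A BUILD file could then load and call it like any built-in rule:

```starlark
load("//:sdist.bzl", "sdist_package")

sdist_package(
    name = "arimo_pw_pkg",
    srcs = ["setup.py", "MANIFEST.in", "//arimo_pw:python"],
    out = "arimo_pw.tar.gz",
)
```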