Caching is one of the most basic system design techniques, and every software engineer needs to understand it. There are already many great articles about caching, so I will mainly focus on sharing my opinions and experience on improving and monitoring your cache component.

When you introduce caching to your architecture, you make some trade-offs. Below are some examples:

  • Increased complexity: you will have to make your application cache-aware; more code means more potential problems.
  • Increased maintenance: there is an extra component to take care of; your cache component can become a single point of failure.
  • Eventual consistency…
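
To make the trade-offs above a bit more concrete, here is a minimal sketch of a cache-aware read path with hit and miss counters. It is not taken from the article: the `MonitoredCache` class, the `loader` callable, and the in-process dictionary store are all assumptions chosen for illustration, and a real deployment would more likely sit in front of an external cache.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class MonitoredCache:
    """Tiny in-process cache-aside wrapper that counts hits and misses.

    `loader` is a hypothetical slow lookup (for example a database call).
    """

    loader: Callable[[str], Any]
    store: Dict[str, Any] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get(self, key: str) -> Any:
        if key in self.store:
            self.hits += 1
            return self.store[key]
        # Cache miss: read from the source of truth, then populate the cache.
        self.misses += 1
        value = self.loader(key)
        self.store[key] = value
        return value

    def invalidate(self, key: str) -> None:
        # Until this is called, readers keep seeing the cached (possibly stale) value.
        self.store.pop(key, None)

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The extra `get`/`invalidate` logic is the added complexity, the stale read before `invalidate` is the consistency trade-off, and `hit_ratio` is one simple number you could export to a monitoring system.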

I finally passed the Certified Kubernetes Application Developer (CKAD) exam last week. CKAD is a certification from the Cloud Native Computing Foundation (CNCF) that tests your hands-on knowledge of Kubernetes. For me, the test was a way to validate my own experience with it.

I wrote this post to share my experience of preparing for the exam.

#1 Understand the Exam structure

First, CKAD is not like a regular exam where you are presented with questions and pick one of the answers. It’s a hands-on assignment where you work through various scenarios that simulate real use cases of Kubernetes. This youtube…

Before we begin, let’s start with a definition of an unknown value in a machine learning system. It’s basically a categorical value that your model has not seen in either the training or the evaluation dataset. For example, your loan default prediction model has five income categories, but it receives something else.
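
One common way to cope with such values is to reserve a dedicated bucket for anything unseen. The sketch below is only an illustration under that assumption, not necessarily the approach this article goes on to describe; the `UNKNOWN` token, the helper names, and the five income labels are hypothetical.

```python
from typing import Dict, Iterable

UNKNOWN = "<unknown>"  # hypothetical reserved bucket for unseen categories


def build_vocab(values: Iterable[str]) -> Dict[str, int]:
    """Map each category seen in training to an index; index 0 is kept for unknowns."""
    vocab = {UNKNOWN: 0}
    for value in values:
        vocab.setdefault(value, len(vocab))
    return vocab


def encode(value: str, vocab: Dict[str, int]) -> int:
    """Return the index of a known category, or the unknown bucket otherwise."""
    return vocab.get(value, vocab[UNKNOWN])


# Five hypothetical income categories seen during training.
income_vocab = build_vocab(["very_low", "low", "medium", "high", "very_high"])
```

With this in place, `encode("medium", income_vocab)` returns the index learned in training, while an unexpected value such as `"undisclosed"` falls into index 0 instead of raising an error at serving time.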

You may argue that this is not a problem because your dataset covers all the possible values. …

Source: DataBricks

Spark 3.0 is out, and there are a ton of improvements! But there is one nice improvement that is not highlighted in the announcement post: filter pushdown for CSV files.

Prior to Spark 3.0, when you loaded a CSV file, the whole file was read into memory and the filter was applied afterwards, which wastes CPU cycles and bandwidth. Now, the data can be filtered as the files are read. This is similar to filter pushdown in Parquet, but now for CSV files.

Here is a quick example: I load a CSV file (flights dataset from Kaggle), then…

Docker lets you limit the memory, CPU, and, more recently, GPU available to your container. It is not as simple as it sounds.

Let’s go ahead and try this:

docker run --rm --memory 50mb busybox free -m

The above command creates a container with 50mb of memory and runs free to report the available memory. If you run this on a Mac, you should see output similar to the screenshot below.

Why doesn’t it show 50mb, as set in the memory parameter? Why does it show 2gb of memory, and where does the 2gb come from?

This is the first…

This is my collection of notes and opinions on Software Architecture. It helps guide me through software architecture and design. I am publishing it in the hope that it will be helpful to others, and also to receive feedback 🙂

Architecture is about identifying the components necessary to support the business requirements, their characteristics, their roles, and how they interact with each other.

Software design is the realization of the architecture. There may be multiple designs that support the architecture. One can consider the architecture to be the most abstract design of the system.

Architecture is about things that are not likely…

I have been using Go for a while, but mainly for tools. So I decided to invest some time to learn more about the language, and also more about system programming and distributed programming.

The chat server was just a random idea. It is simple, yet complicated enough for a sandbox project. I would try to do everything from scratch.

This post is more of a summary of my experience during the exercise. If you want to look at the source code, it is in this GitHub repository.

So let’s start!

I will start with very basic features:

  • There is a single…

So I had a challenge the other day: restoring an EC2 instance from EBS snapshots. I have worked with AMIs and EBS for many years, but I had never tried this before.

I was given some information about the environment, such as the host OS and the SSH listening port, and that was it; I would have to figure out the rest. In fact, there were two EBS snapshots: one for the root volume and one for data. And I would need to bring up the instance and the data in the middle of the night…

The first…

AWS just announced a new service, AWS Secrets Manager, at the SF Dev Summit (I was there for the announcement 😇). It is a cool service that helps you manage and rotate your secrets securely.

But actually, this is not something new. There is also a lesser-known service, AWS Systems Manager (SSM, formerly Simple Systems Manager), that provides a similar feature to Secrets Manager. Today I would like to show you this service and how you can easily use it in Python.

AWS Secrets Manager and AWS SSM Parameter Store

AWS Secrets Manager helps you to store, distribute, and rotate credentials securely. You can use it…
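
As a rough sketch of what reading a value from SSM Parameter Store in Python can look like, the helper below wraps the boto3 `get_parameter` call. The `get_parameter` wrapper itself and the optional `client` argument are assumptions added for illustration, and running it against AWS assumes credentials and a region are already configured.

```python
from typing import Any, Optional


def get_parameter(name: str, client: Optional[Any] = None) -> str:
    """Read one (possibly encrypted) value from SSM Parameter Store.

    `client` is an optional, hypothetical injection point so the function
    can be exercised with a stub instead of a real AWS connection.
    """
    if client is None:
        import boto3  # imported lazily; only needed when talking to AWS

        client = boto3.client("ssm")
    # WithDecryption=True asks SSM to decrypt SecureString parameters.
    response = client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]
```

A call such as `get_parameter("/app/db/password")` would then return the stored string; the parameter path here is made up and would need to exist in your own account.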

Timezones are a hard problem. DST is an even harder problem. I found myself running into problem after problem when I started using datetime in Python properly. So I decided to write a blog post to share my experience.

The first thing to know is that in Python there are two types of datetime: offset-naive and offset-aware. Offset-naive means that the datetime has no timezone information, which can be very error-prone if you are new to Python. If you mix a naive datetime and an aware datetime in a comparison or subtraction, you will get an error. …
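
The difference between the two kinds of datetime can be sketched with the standard library alone. The dates and the `is_aware` helper below are made up for illustration; the check follows the definition of "aware" given in the Python `datetime` documentation.

```python
from datetime import datetime, timezone

naive = datetime(2020, 3, 8, 12, 0)                       # no tzinfo attached
aware = datetime(2020, 3, 8, 12, 0, tzinfo=timezone.utc)  # carries an explicit UTC offset


def is_aware(dt: datetime) -> bool:
    """True when the datetime has a tzinfo that yields a UTC offset."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None


# Ordering comparisons between naive and aware datetimes raise TypeError.
try:
    naive < aware
    mixed_error = None
except TypeError as exc:
    mixed_error = exc

# One way to fix it, assuming the naive value was meant to be UTC all along.
fixed = naive.replace(tzinfo=timezone.utc)
```

Note that `replace(tzinfo=...)` only attaches the offset and does not shift the clock time, so it is appropriate only when you already know which timezone the naive value was recorded in.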

I write, so I learn.
