This assignment is meant to give you an understanding of containers. You will be completing the implementation of a very basic container management engine, called Hawker. You'll be implementing Hawker for Linux, and you'll need to understand the two mechanisms that the Linux kernel provides to enable them, namely namespaces and cgroups.
You will not be implementing full container functionality, but enough to get the look and feel of working with more traditional container environments. Basically, your system will allow us to run either a shell or a command inside a container. This container will be isolated in the following namespaces: UTS, PID, user, mount (filesystem), network, and IPC. Your code will only need to deal with the first four (so we are assuming that you won't be engaging in any networking or IPC activities inside your Hawker container).
For this project, you'll want to make sure to develop in a Linux environment. I'm using a Fedora 29 VM with Linux kernel 4.19. You can get away with using another distro, but you probably want to use one that runs systemd; otherwise you can't really count on the cgroup system being started at bootup (you can set it up manually, however, as outlined in the cgroups documentation linked below). You might first want to play around with Docker or LXC containers to get a feel for how they should work. You should plan on using the LWN series on cgroups and namespaces as a reference.
You'll first want to install some dependencies, namely the development headers for libcurl (> v7.61) and libarchive; Hawker uses these to grab images from the network and unpack them. You should also install the development headers for libcap. For example, on Fedora:
$> sudo dnf install -y libcurl-devel libarchive-devel libcap-devel
You can then download the project code. You must run the setup script before attempting to use hawker; otherwise, things will break. The setup script needs sudo access, so you will be asked for your sudo password when it runs.

$> curl http://cs.iit.edu/~khale/class/vm-class/f18/hawker-skeleton.txz > hawker-skeleton.txz
$> tar xJf hawker-skeleton.txz
$> cd hawker-skeleton
$> ./setup.sh
$> make

The last command will build the skeleton, but don't expect it to work properly yet! You will have to complete some functionality first. You should be able to run hawker now, but it won't be very useful. To get an idea of how it's used, you can run it without any arguments or like so:
$> ./hawker -h
I've also provided a reference binary for you if you'd like to see a working copy doing its thing. You can get it like so:

$> curl http://cs.iit.edu/~khale/class/vm-class/f18/hawker-ref > hawker-ref
$> chmod +x hawker-ref
$> ./hawker-ref -h

I've provided a single container image for you to test with (called test). You can use it with the reference binary to create an interactive shell inside the container like so:

$> ./hawker-ref test /init

If you navigate around this container shell, you'll see that this is pretty much just the BusyBox image you created for your previous QEMU homework.
Your first job is to get hawker to spawn off another process that will form the shell of the container. You'll then use namespaces to isolate this process from the rest of the system and set up its environment.
For this part, you'll be filling in some functionality for using namespaces properly to isolate the container from the rest of the system.
When you first get the code, it's not going to actually create a container. You should start by looking in hawker.c. This is where the heart of the engine is, and where you'll be doing all of your work (you can ignore the other source files unless you're curious). All of the functionality that you need to implement is tagged with FILL ME IN comments in the source.
You should start with hawker.c. The first thing the engine does is initialize its image server, which involves starting a network subsystem and an image cache (in a hidden directory under your home directory). The image must be specified at the command line (just like with the docker command). Hawker will then look for the image in its cache, and if it doesn't find it, it will try to download the image from my image server (hosted on my IIT page). If it succeeds, it will extract the compressed image into the cache.
This is where the fun begins. After user arguments are parsed, we need to create a new child process for the container. We will be using the clone() system call for this purpose. You'll want to make sure the proper flags to clone() are set. The source tells you which namespaces you'll need, and you can use the manpage for clone() to determine the exact flags.
At this point, the program will exit. You should remove this call to exit() and allocate a stack for the new process according to the comments in the source. It's up to you whether to use a malloc() variant or mmap() to allocate your child process stack; both have their merits. A default stack size is given for you in the source.
Once we have a stack, we can clone() with it. I've done this for you. Note that after the clone completes, we'll have two processes running. The child process will be running in separate namespaces (according to the clone flags we set up). However, before we let the child container process loose, we must set up its environment, so we have to make it wait for us (the parent process) to finish that setup. I achieved this using pipe() (which you should remember from CS 450). We're using it here as an asynchronous event notification mechanism. Essentially, the child waits on one end of the pipe for the parent to hang up the pipe, at which point the child can continue doing whatever it wants. Other notification mechanisms would work here as well.
The main thing we need to do here is set up the UID namespace mapping. By default, Linux does not set up a mapping from UIDs outside the container to UIDs inside the container, so the process will run with the kernel's overflow UID (65534). However, we'd like our container to be running as root (UID 0) with group ID (GID) 0.
There are three files we need to modify (after the child is running) to set up these maps. See the LWN article and the Linux man page on user namespaces for more details. You might want to use the constant provided in the source.

Note that we don't need to do this for the PID namespace, since Linux does this for us. The command we run will run with PID 1 (essentially as the init process). This creates some issues with signal handling and zombie processes, since the kernel treats PID 1 as special. While we won't be worrying about this, some commercial container systems get around it by spawning a special init process which then launches the user command as PID 2.
If we build and run hawker at this point, a new process will be created in a separate set of namespaces (you should verify this with the ps command inside and outside of the container, and the hostname command inside the container), but things will still be broken because we haven't set up the container's environment (we're not making use of the container image that the network/image subsystem is providing us). Your next task will address this.
We now move our attention to the code for the child process. At this point, the child waits for the parent to notify it that it should continue, but then it simply exits. You will fix this.
We need the child to set up its new environment. To do that, we need to do four things:

1. Change the root of the filesystem to the extracted container image, which lives in ~/.hawker/images/<image-name>. You should add code that changes to this directory using the chroot() system call. You might want to make use of the hkr_get_img() function provided by img.h. This will require the image argument stored in the struct parms structure filled out by the argument parser.

2. Change the working directory. You'll use the chdir() system call here, but be careful! What directory path should we actually use here?

3. Set the container's hostname using the sethostname() system call. A default hostname is provided for you in the source.

4. Execute the user's command. We use the execvp() system call for this, which will create a new address space for us (the child process), load the binary file of the command provided, and execute it. Note that execvp() also takes arguments; these should be the arguments of the command (which can also be derived from the struct parms).
At this point, hawker should work correctly with namespaces. You can now test your version with the test image provided for you. For example, I might run:

$> ./hawker test /bin/ls

which should print out the contents of the root directory for the test container image. If I want to get an interactive shell for the container, I can run:

$> ./hawker test /init

Note that this is a bit unlike how this is done in Docker. There are some subtleties when working with an init process and when allocating ttys (see the extra credit if you'd like to help make this closer to Docker). You should be able to run commands like ps from the container now. Note the PIDs, hostnames, etc. that you see. You should also be able to use other commands provided by BusyBox (the test image is a BusyBox image).
There's another special command in the container image called hog. All it is going to do is use as much CPU as possible. Open up another terminal and run htop in it. Then, in your original terminal, run:

$> ./hawker test /hog

Watch the CPU meter in your other terminal. Indeed, the hog command is doing what we thought it would, pegging your CPU. We want to prevent this from happening with our containers, so we need to introduce some notion of resource control.
We want to be able to limit the amount of certain resources that our container can use. For this project, you'll only be dealing with the maximum amount of memory the container can use and the amount of CPU it can use. Run ./hawker -h to get an idea of how to use these options. As it stands, if we pass values for the -m and -c flags, they'll just be ignored. Your task here will be to fix this using Linux's cgroup subsystem (make sure to do the reading on cgroups above).
For this task, you'll want to bring your attention back to the main() function in hawker.c, right at the point with the comment that says BEGIN RESOURCE CONTROL.
There are two things you must do here: setting CPU limits and setting memory limits. The cgroups subsystem is manipulated through the VFS (by reading and writing virtual files) rather than through standard system calls. Make sure to do the reading on cgroups to get an idea of how they're used. The appropriate cgroup directories are already created for you; you'll have to modify a few files in these directories based on the values passed to the -m and -c flags by the user. The files of interest for CPU live in /sys/fs/cgroup/cpuacct/hawker/<container-PID>; there you'll set the CPU bandwidth knobs (cpu.cfs_period_us and cpu.cfs_quota_us). For memory, the file lives in /sys/fs/cgroup/memory/hawker/<container-PID>; the file of interest is memory.limit_in_bytes. This will prevent your container from using too much memory.
You'll need to write appropriate values to these files to set up the proper resource control. However, you're not done yet, because the resource controls you've set up have to be associated with a PID that is bound to them! For that, you'll need to modify the tasks file in both of the above cgroup directories. You should think carefully here: which PID should be written to those files? Hint: why is the parent the one writing these files?
That's it for this part!
If you did the above task correctly, you should now be able to control the resources of your container. For example, if we re-run our previous example (using the hog program), we should see its resources being limited. Open up another terminal with htop again and run this:

$> ./hawker -c 50 test /hog

You should now see that your container is only using 50% of the CPU! Cool.
That's it! You're done now. If you've finished quickly, take a look at the extra credit examples. Otherwise, you're ready to hand in.
To hand in your project, you should run make handin with the MY_NAME environment variable set to the first letter of your first name followed by your full last name. For example, I would run MY_NAME=khale make handin (or, if your shell doesn't support inline variable assignments, env MY_NAME=khale make handin). Send me the resulting p4-<you>-handin.txz file over e-mail.
This project is due by next Friday, Dec. 7 2018 at 11:59PM.
If you have time and want to extend your implementation, I'm willing to give you extra credit. You can propose your own extensions, but here are some ideas to start with:

- Interactive containers with proper tty allocation (a la docker -i)
- Restricting privileged operations (e.g., mknod) from the new mount namespaces
- Support for building your own container images (a la Dockerfiles and company)