Lab 0: Basic C Programming & The Linux Environment

Due Date: None (no required hand-in for this lab)

Overview

In this lab, you'll be learning to write and execute basic C programs in a Linux environment (on a remote machine). This is a fairly common mode of programming, especially for systems programmers. For future courses (such as CS 351, the course following this one, and systems courses like networking, databases, and OS), proficiency with the C language is an absolute must. While it can be painful at first, especially if you are used to programming in higher-level or scripting languages like Python, Ruby, JavaScript, or Racket, by learning and using C, you will have a new-found appreciation for those languages (and the people who implemented them, often in C!).

Later on, we'll see how these C programs are translated into lower-level languages: first assembly, and then the machine language that your processor actually executes.

Note that you are not required to turn anything in for this lab.

The C programming language

C is typically called a "low-level" language. What this really means is that it is much easier to do things closer to the hardware than in a language like Java. The vast majority of "low-level" code is written in C, even though it is a language that was designed in the early 1970s! Most major OSes, compilers, and database engines are written in C. This is partly because C is fast. This comes as a trade-off, however, because it also allows you to shoot yourself in the foot. Because of this, you'll find that C programs will tend to be longer than their Python or Java counterparts. This is partly because it doesn't give you as many "bells and whistles," and partly because people who know what they're doing in C tend to write "defensive" code. I like to think of programming in C like driving a car with a stick. Superfluous for day-to-day activities like going to the store, taking a road trip, etc., but absolutely necessary when you're on a racetrack and you need precise control over the machine.

Why should you learn to program in C? Well, think about it this way. There are a lot of people who can drive around with an automatic transmission. How many know not only how to drive stick, but also how to extract the car's full potential? You can imagine that those people will be valuable when such skills are required! Similarly in computer science, when you're trying to chase after extraordinary speed and control, you will be extremely limited in a language like Python, Java, or whatever new hot language on the block that ties your shoes and does your grocery shopping. You therefore want to be one of these people who can program and think at the lowest levels of the machine. That is a common theme in this class.

Logging into the course server

As noted on the course homepage, we'll be using fourier.cs.iit.edu as our course server. You should already have an account that allows you to log in. If this is not the case, let me know immediately so I can get one created for you.

fourier is a Linux server. Linux is an open-source OS kernel that is used on most internet machines today (and increasingly on personal devices; for example, Android phones run a Linux-based OS). It is in your interest, therefore, to become familiar with it early on. Linux is a bit different from the OSes you're probably used to at this point (macOS, Windows). In those OSes, a large portion of the environment is part of the OS, i.e. those things that determine the "look and feel" of your system. This is not true with Linux. Linux just gives you a bare-bones kernel that will land you on a text-based command shell, without any programs. To get the familiar desktop environment that people are used to, third parties build "distributions" on top of Linux. These give you things like windowing systems, file explorers, media suites, etc. Popular examples these days are Ubuntu, Fedora, Red Hat Enterprise Linux, Mint, Debian, Gentoo, OpenSUSE, and Arch. All of these but RHEL are free. Arch is the most customizable (and most time-consuming if you're a novice).

To log in to fourier, use the ssh (secure shell) command. This lets you log in to remote systems over an encrypted channel, which makes the connection extremely difficult to eavesdrop on (some people set up VPN tunnels using ssh). If you're on a Mac, you already have it in your terminal (the Terminal app or iTerm2). The same goes if you're on a Linux box. If you're on a Windows machine, you can use PuTTY or Cygwin to use ssh. Either way, if you have trouble, your Lab TAs can help.

Practice logging into fourier with the following:


$> ssh your-user-name@fourier.cs.iit.edu

This should land you at a Linux command prompt (a "shell"). The shell is just a command interpreter that also happens to have its own little programming language. The most common shell used in Linux is bash. In addition to running commands in bash, you can also write scripts in the bash language. You should explore this in your own time. What to do next? See the course homepage for links on getting started with the Linux shell. You should also read through our Linux handout for a brief introduction.

Getting the code

Once you log in to fourier, you can get the code with the following command:

$> cp /home/khale/HANDOUT/lab0.tgz .

This copies (using the cp command) the lab0 file from my directory on fourier to your current working directory (the . is shorthand for your current directory). Note the .tgz extension. This is a file that has been created by a combination of tar and gzip. The files are first bundled into a single archive with tar, and that archive is then compressed with gzip. The result is very similar to a .zip file. tar and gzip are so widely used together that they are both invoked via the tar command. The original file extension would be .tar.gz, though I've shortened it here, which is common practice. You'll see these types of archives often (more often than .zip archives). Sometimes they are called "tarballs."

To untar the archive, you can run the following:


$> tar xvzf lab0.tgz

This will untar the lab0 archive into a folder named l0. Here, the x flag means "extract" (untar), the v flag means "verbose" (tell me the files you're extracting as you extract), the z flag means "invoke gzip to decompress before untarring", and the f flag means "I'm giving you the filename."

You can then see the contents of the directory by doing the following:


$> cd l0
$> ls
Makefile lab0.c

Here, I first navigate to the l0 directory with the cd (change directory) command. Then I list the files in this directory with the ls command (list). The output is two files, the C code for this lab (lab0.c) and a Makefile (more on that later).

To simply read the contents of the C file, you can run the following command:


$> cat lab0.c
...

You'll see some output, and it will likely overflow your terminal window. If you want to be able to scroll through the output, use a pager program like less:

$> less lab0.c

To move up and down in less, use the k and j keys, respectively (k moves up, j moves down). To move by page, use ctrl+f (forward) and ctrl+b (back), or PageDown and PageUp. You can hit q to quit out of less.

Now that you've seen the code, let's compile it. You can do that with the following:


$> gcc -Wall -std=c99 -o lab0 lab0.c -lm

Here, gcc is the command that invokes the C compiler from GCC, the GNU Compiler Collection (the name originally stood for "GNU C Compiler"). It is pretty much the default C compiler on Linux these days (although Clang is rapidly gaining popularity). The -Wall option enables a broad set of commonly useful warning messages (despite the name, not literally all of them). These do not have to be actual errors; they could be instances of the compiler telling you that you might be doing something wrong (or something that is frowned upon). The option -std=c99 means "use the ISO C99 standard of the language." It's important to include this, because we all want to compile our code the same way. Different C standards can result in different program behavior given the same code! -o lab0 means "compile my program into an executable named lab0." If you don't provide the -o flag, it will use the default executable name, a.out. Next comes the C program that we're compiling (the input file). Finally, -lm (ell em) means "link against the C math library." This gives you math functions like sqrt and pow. It goes last because the linker resolves names in the order that files appear on the command line, so on some systems a library listed before the code that uses it is skipped.

To run the program, you can do the following:


$> ./lab0

The dot slash (./) is necessary here, because if you were to just type lab0, the shell would not know where to find the executable: your current directory (l0) is not on its search path (the list of directories in the PATH environment variable).

As you might imagine, typing in that command over and over will become tiresome. The standard practice for getting around this is to use a build system. You're probably used to a build system that is integrated into your IDE. Hit a button, and everything in your project compiles nicely. We'll be using GNU Make for this (see the note below). To build the lab code using Make, simply run:


$> make

You should see the same program executable (lab0) as in the previous example. To remove it, you can run:

$> make clean

Note that Make is smart enough to detect changes to your source file. For example, if you run make again (without having run make clean), you'll notice that it will spit out a message that says "make: Nothing to be done for 'all'". Make compares the modification times of the source file and the executable, sees that nothing has changed, and so doesn't rebuild. If you change the file, the next time you run Make it will know to actually do the compilation.

We'll be using make throughout this class, so be sure to become familiar with it.

Note: GNU Make

Make is a build system for compiling software, especially when dependencies are involved. It allows you to encode all the files that need to be compiled (and their dependencies) in one place (the Makefile). Once you have that Makefile, you simply run make to build your code. A huge amount of software (especially open-source software) uses make for building, so learning it sooner rather than later will be in your interest.

You'll want to become familiar with basic Make usage. Here is the GNU manual for Make. Here is a nice, brief introduction.

Once you learn the basics of Make, you'll be able to move on to more advanced uses. This is when you'll really start to appreciate this tool's power. It can be very flexible.

Make is not only used for compiling code. It can be also used for executing tasks with dependencies at the command-line. For example, I use it to build and transfer all the code, documents, and webpages to the course servers for this class. I also use it in my research extensively, for example for generating data and graphs, and for running series of scripts.

The sample program

Read through lab0.c. You'll notice that C shares many similarities with Java, but there are many differences as well. For now, ignore the problems.

The program contains several important constructs from the C language, including:

Find some reference material on basic C programming (for example, the material listed in the Learning C section) to understand how this program works.

Problems

There are problem descriptions in the comments of the lab0.c program. Write out answers to the problems and check them with the answers I'll post next week. Note, you do not need to turn these answers in.

Note: Text Editors

For the problems listed in lab0.c, you'll actually need to edit the C file. If you have no other experience than what we've seen so far on this page, you'll need to learn to use a terminal-based text editor (if you choose to do your work on fourier). These are sometimes loosely called line editors, although vim and emacs are really full-screen editors. They stand in contrast to the graphical, point-and-click editors you may be used to, such as "What You See Is What You Get" (WYSIWYG, pronounced "whizeewhig") word processors like MS Word, Pages, or LibreOffice, and GUI text editors like TextMate, TextEdit, or Sublime. Instead of a point-and-click environment, you use special key commands to navigate the document and edit text. While the learning curve for these editors can be steeper than for a graphical editor, they can boost your productivity tremendously, especially when writing code and automating text editing tasks.

The two major text editors used these days are vim and emacs. They feel very different to use, and which you choose is a matter of taste (the debate over which one is better is pretty much religious at this point). Either way, they both allow you to do similar things. You want to get started on learning one of these ASAP. Here, here, and here are some excellent resources for getting started with emacs. Here is a set of videos, an interactive tutorial, and a game for learning vim.

If learning these is simply too much at this point, consider using nano.