Sywtbals? – Assignment 2

This is the second in my “So you want to be a Linux sysadmin?” series. See the first post here and the category here.

Now that you have completed Assignment 1 and have a working sandbox Linux system installed, it’s time to cover the basics.

First, when I started using Linux, I got a boxed set of RedHat that included a nice cheat sheet sticker to go on a keyboard wrist rest. In lieu of that, I just saw this posted today. I recommend you keep it bookmarked and perhaps printed out for reference: Linux Command Shelf Cheat-Sheet. There’s a PDF for download on that site. I also recommend you somehow (RSS or Facebook) subscribe to NixCraft. I still regularly learn useful things from that blog.

Now, when taking care of a Linux (or other *NIX) system, you’re often going to need to download and extract file archives. Typically these are either singly-zipped files (.gz or .bz2 from gzip and bzip2, respectively), or so-called “tarballs” (.tar, .tar.gz, .tgz, .tar.bz2, etc). Tarballs are created with the tar program, whose name comes from Tape ARchive. The additional suffixes indicate that the tarball has been compressed in some way. In other words, a simple .tar file contains a bunch of files all rolled up into a single archive, but not compressed, while .tar.gz (.tgz for short) files have additionally been zipped up in some fashion. Incidentally, tar is also still used to write files to tape, if that’s your thing.
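To make the difference between a plain .tar and a .tar.gz concrete, here is a minimal sketch that rolls a few files into a tarball and then gzips a copy of it. The scratch directory and the names demo, demo.tar, and demo.tar.gz are made up for illustration, and it assumes tar, gzip, and mktemp are installed (they are on most distributions).

```shell
# Work in a throwaway directory so nothing of yours gets touched.
scratch=$(mktemp -d)
cd "$scratch"

# Make a small directory with a few text files in it.
mkdir demo
for i in 1 2 3; do
    echo "this is file $i" > "demo/file_$i"
done

# c = create, f = write the archive to the named file. No compression yet.
tar cf demo.tar demo

# Compress a copy with gzip, leaving the original .tar in place.
gzip -c demo.tar > demo.tar.gz

# Compare the two sizes in bytes.
tar_size=$(wc -c < demo.tar)
gz_size=$(wc -c < demo.tar.gz)
echo "demo.tar:    $tar_size bytes"
echo "demo.tar.gz: $gz_size bytes"
```

The .tar is just the files rolled together (plus some padding), so the gzipped copy should come out noticeably smaller.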

The best way to get a file for download from the command-line (which I will focus on almost entirely in this series) is wget. So, open up a terminal window. You should have a simple prompt waiting for your input. Check to see whether you have wget by typing


which wget

The which command searches through all of the directories on your PATH, an environment variable that tells the shell where to find programs, for the command you list, in this case wget. It’s a good way to check for the presence of a program on a system. If that doesn’t work, but you think it’s around somewhere, you can sometimes use locate, but we’ll not cover that today.
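If you want to see the PATH for yourself, or do the same kind of check from a script, something like the following sketch works. Note that it swaps in the shell's built-in command -v rather than which; for a simple "is it there?" test the two do the same job. The helper name have_program is made up for this example.

```shell
# PATH is a colon-separated list; print one directory per line.
echo "$PATH" | tr ':' '\n'

# Hypothetical helper: succeeds if the named program can be found.
# command -v is the shell's built-in cousin of which.
have_program() {
    command -v "$1" >/dev/null 2>&1
}

if have_program ls; then
    ls_found=yes
else
    ls_found=no
fi

if have_program snarfblat; then
    snarfblat_found=yes
else
    snarfblat_found=no
fi

echo "ls: $ls_found, snarfblat: $snarfblat_found"
```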

Now, if you have wget, you should get in return the full path to the program, something like /usr/bin/wget. If so, you can proceed to the next step where it says ‘Downloading a file with wget’.

If not, the utility will print nothing. I tried a made-up program called ‘snarfblat’:


brock@gamont:~$ which snarfblat
brock@gamont:~$

Here’s a successful which call for wget:


brock@gamont:~$ which wget
/usr/bin/wget
brock@gamont:~$

If you don’t have it, try sudo apt-get install wget on a Debian-based distribution or sudo yum install wget on a RedHat-based distribution. If you’re not set up with sudo, we will cover that later. For now become root with su - followed by the root password, and then run the above commands without the sudo at the beginning. Type exit or press Ctrl-D to exit the root shell once you have installed it. Double-check that it’s now installed by calling which wget again. If you get stuck, please email me or comment here so we can take care of it and you can continue.
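If you are not sure which family your distribution belongs to, a rough sketch like this one can tell you which of the two install commands applies. It only prints a suggestion and does not install anything. It assumes that finding apt-get or yum on the PATH is a good enough hint, which is true for the common cases but not a guarantee.

```shell
# Work out which install command applies, without running it.
if command -v apt-get >/dev/null 2>&1; then
    install_cmd="sudo apt-get install wget"
elif command -v yum >/dev/null 2>&1; then
    install_cmd="sudo yum install wget"
else
    install_cmd=""
fi

# Report what we found.
if command -v wget >/dev/null 2>&1; then
    echo "wget is already installed"
elif [ -n "$install_cmd" ]; then
    echo "wget is missing; try: $install_cmd"
else
    echo "wget is missing, and neither apt-get nor yum was found"
fi
```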

Downloading a file with wget

For this tutorial, I’m going to have you download a sample tarball that I have created. It contains a bunch of directories and files with various names and contents. We will use it in this assignment and later assignments to practice command-line basics.

For now, go to a terminal. I recommend you make a directory for this with mkdir ~/sywtbals, where ~/ is a shortcut for your home directory (in my case, /home/brock). Then change to that directory: cd ~/sywtbals.

Now, run wget to pull down the file.


wget http://blog.brocktice.com/sywtbals/1.tgz

It should look like this:


brock@water:~/sywtbals$ wget http://blog.brocktice.com/sywtbals/1.tgz
--2013-05-05 17:30:33--  http://blog.brocktice.com/sywtbals/1.tgz
Resolving blog.brocktice.com... 209.188.113.105
Connecting to blog.brocktice.com|209.188.113.105|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 215040 (210K) [application/x-gzip]
Saving to: “1.tgz”

100%[======================================>] 215,040      259K/s   in 0.8s    

2013-05-05 17:30:34 (259 KB/s) - “1.tgz” saved [215040/215040]

brock@water:~/sywtbals$

Give yourself a self-high-five if you noticed I changed machines in the middle of this post from gamont to water.

Now that you’ve got the file, let’s take a look at it with ls, the file-listing tool:


brock@water:~/sywtbals$ ls -l 
total 212
-rw-r--r-- 1 brock brock 215040 May  5 13:40 1.tgz
brock@water:~/sywtbals$

I used the -l for the ‘long’ listing that shows details about each file. This shows us a few interesting things. One is the permissions:


-rw-r--r-- brock brock

This tells us the permissions for the user, group, and everyone else. A file may only have one group and one user assigned to it. There’s SELinux (Security-Enhanced Linux), the NSA’s enhancements to make Linux permissions more fine-grained, but honestly I don’t understand SELinux well enough to do more than turn it off, so I’m not going to discuss it here.

So, in this case the user has read and write permissions, the group has read permissions, and everyone else also has read permissions. The user is brock and the group is brock. Note that even though the user and group names are the same they are totally separate. One represents a userid, and the other represents a groupid. Observe the use of the id command to elaborate:

brock@water:~/sywtbals$ id
uid=1000(brock) gid=1000(brock) groups=1000(brock),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),100(users),108(netdev),109(bluetooth),112(fuse),115(scanner),120(libvirt),1001(family)
brock@water:~/sywtbals$

Here you can see that the user id for brock is 1000, and the group id for brock is also 1000. That is just a coincidence, because I was the first user added to the system. The group id for brock could just as easily be 2005 or something.
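The id command can also hand you these pieces one at a time, which makes the user/group split easier to see. A small sketch using the standard -u, -g, -n, and -G flags (the variable names are just for illustration):

```shell
my_uid=$(id -u)      # numeric user id
my_user=$(id -un)    # user name that goes with it
my_gid=$(id -g)      # numeric id of your primary group
my_group=$(id -gn)   # name of your primary group

echo "user:  $my_user ($my_uid)"
echo "group: $my_group ($my_gid)"

# Names of every group you belong to, primary group included.
id -Gn
```

Even when the two names match, the user id and group id are looked up separately, so the numbers do not have to agree.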

Let’s say this file is somewhat secret and I don’t want anyone else (except root of course) on the system to be able to read it. We’ll change it like this:

brock@water:~/sywtbals$ chmod go= 1.tgz
brock@water:~/sywtbals$ ls -l
total 212
-rw------- 1 brock brock 215040 May  5 13:40 1.tgz

What I did is to tell chmod to set the group and other permissions to be equal to nothing. As a result, only the user has any permissions at all.

On the other hand, what if we want to allow anyone on the system to read and write the file?


brock@water:~/sywtbals$ chmod go+rw 1.tgz 
brock@water:~/sywtbals$ ls -l
total 212
-rw-rw-rw- 1 brock brock 215040 May  5 13:40 1.tgz
brock@water:~/sywtbals$
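
If you would rather experiment on something other than the tarball, here is a sketch that replays both chmod calls on a throwaway file. The file name secret.txt and the perms helper are made up for this example, and the first chmod only exists to start from a known state regardless of your umask.

```shell
scratch=$(mktemp -d)
f="$scratch/secret.txt"
echo "hello" > "$f"

# Hypothetical helper: print just the 10 permission characters from ls -l.
perms() {
    ls -l "$1" | cut -c1-10
}

chmod u=rw,go=r "$f"     # known starting point: rw for user, r for the rest
start=$(perms "$f")

chmod go= "$f"           # group and other get nothing
private=$(perms "$f")

chmod go+rw "$f"         # add read and write for group and other
open=$(perms "$f")

echo "$start -> $private -> $open"
```

This should print -rw-r--r-- -> -rw------- -> -rw-rw-rw-, the same progression as in the listings above.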

You can also use numerical codes to set permissions. All I ever remember about that offhand is that 0 means nothing and 7 is everything. I prefer the other notation, but you can look it up. A common 4-letter admonition you’ll see as a Linux sysadmin is RTFM — read the fucking manual.

Usually you can find the so-called manpage (manual page) for a program or configuration file like this:


man chmod

This will show the output in the system pager, a program that pages through a text file. Historically this was normally a program called, sensibly, more. Later, an improved program called less replaced it. UNIX humor — less is more. Anyway, there are only four things you need to know to get around less reasonably well. The q key quits. The spacebar moves to the next page. And the / key starts a search. The Esc key gets you back to the main mode. There’s a lot more you can do with less, but those four things will make up 99% of your less usage.

The most important thing you can do with the / search in many manpages as a beginner is to find the Examples section, if there is one. It will usually contain examples of what you’re trying to do. Of course you can also RTFM. Here’s me searching for the examples in the chmod manpage. Note the /examples at the end.


CHMOD(1)                         User Commands                        CHMOD(1)

NAME
       chmod - change file mode bits

SYNOPSIS
       chmod [OPTION]... MODE[,MODE]... FILE...
       chmod [OPTION]... OCTAL-MODE FILE...
       chmod [OPTION]... --reference=RFILE FILE...

DESCRIPTION
       This manual page documents the GNU version of chmod.  chmod changes the
       file mode bits of each given file  according  to  mode,  which  can  be
       either a symbolic representation of changes to make, or an octal number
       representing the bit pattern for the new mode bits.

       The format of a symbolic mode is  [ugoa...][[+-=][perms...]...],  where
       perms  is  either zero or more letters from the set rwxXst, or a single
       letter from the set ugo.  Multiple symbolic modes can be  given,  sepa‐
       rated by commas.

       A  combination  of the letters ugoa controls which users' access to the
       file will be changed: the user who owns it  (u),  other  users  in  the
/examples

Unfortunately, when I hit the Enter key after typing /examples I was greeted with Pattern not found (press RETURN). So I’ll have to properly RTFM. I press the spacebar to scroll down. (Arrow keys and page up/down can also be used to navigate.) Eventually I find this documentation on the octal numerical codes:


       granted  to  users  that are in neither of the two preceding categories
       (o).

       A numeric mode is from one to  four  octal  digits  (0-7),  derived  by
       adding up the bits with values 4, 2, and 1.  Omitted digits are assumed
       to be leading zeros.  The first digit selects the set user ID  (4)  and
       set group ID (2) and restricted deletion or sticky (1) attributes.  The
       second digit selects permissions for the user who owns the  file:  read
       (4),  write  (2),  and  execute  (1); the third selects permissions for
       other users in the file's group, with the same values; and  the  fourth
       for other users not in the file's group, with the same values.

       chmod never changes the permissions of symbolic links; the chmod system
       call cannot change their permissions.  This is not a problem since  the
       permissions  of  symbolic links are never used.  However, for each sym‐
       bolic link listed on the command line, chmod changes the permissions of
       the pointed-to file.  In contrast, chmod ignores symbolic links encoun‐
       tered during recursive directory traversals.

SETUID AND SETGID BITS
       chmod clears the set-group-ID bit of a regular file if the file's group
       ID  does  not  match the user's effective group ID or one of the user's
       supplementary group IDs, unless the user  has  appropriate  privileges.
 Manual page chmod(1) line 44

There you go, the numerical codes are read = 4, write = 2, and execute = 1. Execute means that if it's a script or a program, it can be run, or if it's a directory, you can navigate (change directory or cd) into it. Anyway, to set read/write for user and group, and nothing for everyone else, you'd use (4+2=6)(4+2=6)(0) or chmod 660.
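The arithmetic can be checked directly in the shell. This sketch builds the 660 mode digit by digit, applies it to one scratch file, and applies the equivalent symbolic mode to another so the two can be compared. The file names are made up for illustration.

```shell
scratch=$(mktemp -d)
octal_file="$scratch/octal.txt"
symbolic_file="$scratch/symbolic.txt"
echo "hello" > "$octal_file"
echo "hello" > "$symbolic_file"

# Build each digit by adding read (4), write (2), and execute (1).
user_digit=$((4 + 2))    # read + write
group_digit=$((4 + 2))   # read + write
other_digit=0            # nothing
mode="$user_digit$group_digit$other_digit"
echo "numeric mode: $mode"

# Apply the numeric mode to one file and the symbolic form to the other.
chmod "$mode" "$octal_file"
chmod u=rw,g=rw,o= "$symbolic_file"

octal_result=$(ls -l "$octal_file" | cut -c1-10)
symbolic_result=$(ls -l "$symbolic_file" | cut -c1-10)
echo "octal:    $octal_result"
echo "symbolic: $symbolic_result"
```

Both files should end up with -rw-rw----.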

So, now that we've beaten the permissions thing to death, just a few other notes. In this output:

-rw------- 1 brock brock 215040 May 5 13:40 1.tgz

The last few bits are the file size (215040 bytes) and file modification time, ending with the filename. You can get a handy "human-readable" size using the -h for human flag like so:


brock@water:~/sywtbals$ ls -lh 1.tgz 
-rw-rw-rw- 1 brock brock 210K May  5 13:40 1.tgz
brock@water:~/sywtbals$

I usually use the -h when doing a detailed listing. Because it rounds, however, sometimes you may want to omit the -h to see if a file size matches, has changed, etc.
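When the exact byte count matters, you can also ask for it directly instead of reading it out of a listing. A sketch, using a scratch file filled with zeros that happens to be the same size as 1.tgz (the name sample.bin is made up):

```shell
scratch=$(mktemp -d)
f="$scratch/sample.bin"

# Create a file of exactly 215040 bytes, the same size as 1.tgz.
head -c 215040 /dev/zero > "$f"

ls -l "$f"     # exact size in bytes
ls -lh "$f"    # rounded, human-readable size

# wc -c counts bytes, which is handy when a script needs the number.
exact_size=$(wc -c < "$f")
echo "exact size: $exact_size bytes"
```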

OK! Let's extract this sucker. We'll use a tar command to extract. First, let's get a preview of the file contents. Nice people generally put everything they tar up in a directory first, so that the directory extracts nicely and doesn't make a mess. Sometimes (and occasionally even with good reason), people will put all the contents of the tar right at the top level, and extracting without knowing that will make a big mess. (I'll cover a quick tip to fix that when we get to pipes. Don't let me forget.)

To get a preview of the file contents, we use the t flag to tar:


brock@water:~/sywtbals$ tar tvf 1.tgz
drwxrwxr-x brock/brock       0 2013-05-05 13:31 1/
drwxrwxr-x brock/brock       0 2013-05-05 13:30 1/sequential/
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential/sequential_file_85
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential/sequential_file_34
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential/sequential_file_60
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential/sequential_file_43
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential/sequential_file_30
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential/sequential_file_64
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential/sequential_file_11
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential/sequential_file_77
...
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential_padded/sequential_file_00093
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential_padded/sequential_file_00049
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential_padded/sequential_file_00013
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential_padded/sequential_file_00068
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential_padded/sequential_file_00018
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential_padded/sequential_file_00055
-rw-rw-r-- brock/brock      12 2013-05-05 13:31 1/sequential_padded/sequential_file_00005
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential_padded/sequential_file_00098
-rw-rw-r-- brock/brock      12 2013-05-05 13:31 1/sequential_padded/sequential_file_00001
-rw-rw-r-- brock/brock      13 2013-05-05 13:31 1/sequential_padded/sequential_file_00026
brock@water:~/sywtbals$

The command tar tvf 1.tgz can be broken down as follows: tar is the program, t tells it to list the archive's table of contents without extracting anything, f tells it that the archive is in a file whose name follows (1.tgz in this case), and v means verbose. With t, verbose prints the long, ls -l style details you see above; when extracting, it prints the name of each file as it is extracted. You'll always need f to read an archive from a file. The philosophies on v vary. Some people say using it causes errors to get lost in the shuffle and go unnoticed. Others like to see which files are being extracted. If you do a run like this with t first, then you can go ahead and do the actual extraction later without v and see any errors.
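To see why the preview is worth doing, here is a sketch that builds one "nice" tarball and one "messy" one and lists both with t (without v, so only names are printed). The names project, nice.tar, and messy.tar are made up, and the -C option used to build the messy one tells tar to change into a directory before adding files.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir project
echo "one" > project/a.txt
echo "two" > project/b.txt

# Nice: everything lives under project/ inside the archive.
tar cf nice.tar project

# Messy: change into project/ first, so the files sit at the top level.
tar cf messy.tar -C project a.txt b.txt

echo "nice.tar contains:"
tar tf nice.tar
echo "messy.tar contains:"
tar tf messy.tar

# Keep the listings around in case a script wants to inspect them.
nice_listing=$(tar tf nice.tar)
messy_listing=$(tar tf messy.tar)
```

In the nice listing every path starts with project/, while the messy one has bare file names that would land directly in whatever directory you extract into.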

I'm a nice guy, so I put the contents in a directory called 1. (I'm also an idiot, this is Assignment #2, but I digress). So let's go ahead and extract the file.


brock@water:~/sywtbals$ tar xf 1.tgz 
brock@water:~/sywtbals$

In this case I omitted the v option, and changed from list mode (t) to extract mode (x). tar followed the UNIX convention of outputting no message at all if everything worked correctly.
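Silence is not the only signal, though. Every command also leaves behind an exit status, where 0 means success and anything else means trouble. A sketch, using a small archive built on the spot (good.tar and does-not-exist.tar are made-up names):

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Build a tiny archive to practice on, then remove the original directory.
mkdir 1
echo "hi" > 1/file
tar cf good.tar 1
rm -r 1

# A successful extraction prints nothing and exits with status 0.
tar xf good.tar
good_status=$?
echo "extracting good.tar exited with $good_status"

# A failed one complains on stderr (hidden here) and exits non-zero.
if tar xf does-not-exist.tar 2>/dev/null; then
    bad_status=0
else
    bad_status=$?
fi
echo "extracting a missing archive exited with $bad_status"
```

You can check the same thing interactively by running echo $? right after any command.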

Do a quick listing on the created directory just to double-check:


brock@water:~/sywtbals$ ls -l 1
total 8
drwxr-xr-x 2 brock brock 4096 May  5 13:30 sequential
drwxr-xr-x 2 brock brock 4096 May  5 13:31 sequential_padded
brock@water:~/sywtbals$

Looks good! Questions?

I have a bonus question. tar figured out something about our 1.tgz file and took care of it without us asking. Do you know what that is?