So you want to be a Linux sysadmin?

Introduction

There is a shortage of good Linux system administrators. Some friends of mine have an interest in helping to fill that shortage, so I have finally decided to embark upon a series of blog posts based on my experience. First, let’s get some things out of the way.

Qualifications

There are plenty of Linux sysadmins out there, with varying degrees of experience, and varying methods and opinions. I do not claim to be the leading expert on the subject; I’m just trying to write up what I do know. Constructive comments and suggestions are welcome. People telling me what an idiot I am will have their comments deleted. That said, here are my qualifications regarding Linux system administration. I offer them both to show you I’m not just making this stuff up, and also to give you one example of how people learn Linux sysadmin skills without a formal education or certificate program.

  • I started using Linux in 1998 with a Linux for Dummies book and the included copy of Red Hat 4.something.
  • I’ve run Linux servers in some form and with varying degrees of success since college in 2000.
  • I completed about half of a computer science major in college as part of a double major, until I dropped it to focus on research (see the next point).
  • When I joined a computational science research lab as an undergrad in fall of 2002, I took over the Linux/Irix system administration for the lab, and started porting old Irix applications to run on Linux.
  • When our lab moved to Johns Hopkins University after Hurricane Katrina, I set up our compute nodes (formerly run by a Tulane sysadmin) as a high-performance computing (HPC) cluster. This was the real beginning of my HPC experience.
  • Shortly thereafter, we ordered a new cluster for the lab. I handled the process of deciding specifications, working with vendors to get quotes and negotiate the deal, and then running the cluster (from Penguin Computing) once it was delivered.
  • After graduating, I did some consulting work for Penguin Computing on a variety of HPC tasks. I think this went pretty well based on the feedback I got from Penguin and end customers.
  • At this point I’m running a high-performance cluster in my garage for my work at CardioSolv Ablation Technologies, and also running my own server hosting this blog and a bunch of other stuff.

Disclaimers

  • Just to reiterate: I do not claim to be the leading expert on the subject; I’m just trying to write up what I do know.
  • With Linux, there are usually at least three different ways to do a given thing or solve a given problem. Every time I work next to another sysadmin I learn something from them, and at least a few times they’ve learned things from me. Nothing anyone writes about how to do stuff is gospel.
  • I will not be responsible if you destroy systems or data from following my instructions without understanding them.

With that out of the way, let’s get started.

Welcome to Linux

There are many books you can buy and sites you can read that will give you a history of Linux and an overall philosophy for how it works, etc etc. I’m going to assume if you’re reading this that you already know all that stuff and jump right in. In my experience, the best way to learn how to be a Linux sysadmin is to try to make things work. You’ll learn a lot in the process, so I’m going to do this as a series of assignments.

Assignment 1

If you haven’t already, set up a Linux computer or a virtual machine (using VirtualBox, VMware, or the like), strictly for practicing on. I recommend you start with Debian or CentOS, or possibly Ubuntu Server. I prefer Debian, but most popular Linux distributions are based on either Debian or Red Hat, so learning either of those will get you off to a good start. To be good at Linux system administration, you’ll eventually need to know your way around both. We’ll get to that (and some other interesting distributions like Gentoo) in later assignments.

Set up this machine with three partitions: /, /boot, and /home. Allocate at least 200 MB for /boot and 20 GB for /, and assign the rest to /home.
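Once the install finishes, one way to confirm the layout is to look for the three mount points in the mount table. The following is only a minimal sketch: the function name check_mounts is made up for this example, and it assumes a mount table in the usual /proc/mounts format, where the second field on each line is the mount point.

```shell
#!/bin/sh
# check_mounts is a hypothetical helper, not a standard tool.
# $1: path to a mount table in /proc/mounts format (defaults to /proc/mounts)
# Prints one line per expected mount point; returns non-zero if any is missing.
check_mounts() {
    table="${1:-/proc/mounts}"
    rc=0
    for mp in / /boot /home; do
        # Field 2 of each line is the mount point.
        if awk -v mp="$mp" '$2 == mp { found = 1 } END { exit !found }' "$table"; then
            echo "ok: $mp is a separate mount point"
        else
            echo "missing: $mp is not a separate mount point"
            rc=1
        fi
    done
    return "$rc"
}
```

Calling check_mounts with no arguments on the practice machine reads /proc/mounts directly. To see the sizes as well, df -h / /boot /home will show how much space each partition ended up with.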

When done, you should be able to log in as your primary user (e.g. brock) as well as root, the god-like default system administration account.

Please reply or email me with any questions.

I have once again sighted Baldy Mountain at Philmont

Back when I was, I think, 15, I had the opportunity as a Boy Scout to do a two-week backpacking trek at Philmont Scout Reservation in Northern New Mexico. It was a fantastic two weeks. I learned a lot, lost a ton of weight, and climbed to the top of a mountain called Baldy Mountain, which peaks at about 12,500 feet. To this day I remember how hard I had to breathe, 3 breaths per step, to make it to the very top. I don’t think I ever expected to return to Northern New Mexico.

Since moving to Northern New Mexico, living only a 2-hour drive from Philmont (and make no mistake, my memories of Philmont made me receptive to moving here), I’ve kept meaning to drive out there and see it again, but have never made the time.

This spring, while skiing in Taos, I got a breathtaking view of the mountains to the East from near Kachina Peak, and I could swear that one looked like that same Baldy Mountain I climbed, albeit covered with snow. Just this morning I finally pulled up Google Maps and confirmed that it was, indeed, the same peak. Philmont resides just over the mountains from Taos. I may not have made it back there to visit yet, but at least I’ve now laid eyes on Baldy Mountain and Philmont once again.

Adding InfiniBand to my Bitcoin Mining Cluster for HPC Tasks (Part 1 – Overview)

I now have a cluster with InfiniBand network hardware in my garage. This is my bitcoin mining cluster that I’ve had running for a few years. Last summer I upgraded the CPUs from the cheapest available (Sempron 140s) to something faster but compatible with the same motherboards (Phenom II X4 975 BEs) so that I could run simulations for work, but I ran into scaling issues using 100 Mb/s Ethernet. At that point I got a cheap TrendNet gigabit switch and switched the cluster over to that, and I was able to scale across 2-3 machines (8-12 cores), but after that things really started to slow down.

I’m happy to report that with the InfiniBand hooked up as of yesterday I can now scale across all 6 IB-equipped compute nodes with approximately linear scaling. Unfortunately the older single-data-rate (SDR) hardware I got for ‘cheap’ from eBay didn’t produce the kind of dramatic reductions in latency I expected compared to gigabit Ethernet. I expect for that I’ll really need to upgrade to QDR IB at some point, but I’ll also probably want faster machines at that point. Total cost for the IB upgrade (24-port Cisco Topspin switch $350, 7 Mellanox HCAs for $39.90/ea – $279.30, 7 CX4 cables with latch connectors at $16.00/ea – $112.00) was $741.30. I intend to add some posts later with more details of the setup, as I know there is some interest out there in setting up cheap IB hardware at home.

AR-15 Magazine Block on Defcad

I am building an AR-15 from a stripped lower receiver, and had read that it is good to have a vise block to hold the lower in place while installing parts. Rather than buy one, I was going to 3D print a magazine to use for this purpose. However, I was happy to see that defcad.org has a vise block ready to print. I’m warming up the printer now.

Installing Debian Wheezy (7.0) Linux on the Chromebook Pixel

UPDATE 2013-04-29

I have created a GitHub repository for this here. If you have patches please submit pull requests!

UPDATE: I continue to update the kernel as more fixes make it into git. You can check all of the Pixel-related files I’m posting in this Drive folder.

UPDATE 2013-03-27: New kernel with fix for the audio pops, see my G+ post from today.

The Chromebook Pixel is a very nice (if expensive) piece of hardware, designed to run Chrome OS, which is a variant of Linux. Since being noted as favored by Linus Torvalds, inventor and lead maintainer of Linux, support for the various Pixel hardware components has rapidly been added to the kernel git repository.

Not everything is working great just yet, but all of the essential features are working. Here’s a walkthrough that I hope will be sufficient, based almost entirely on other people’s work and howtos. I’ll link to those where I can. Several Google software engineers have been helpful on Google+, and a bunch of work has been done by Linux kernel maintainers.

This was my starting point: DaveM’s howto in his Linux git repo

  1. Get a Chromebook Pixel
  2. Enable developer mode
  3. Download the Debian Wheezy netinstall image (yes, it supports the wifi in the installer) (here)
  4. Boot from the installer (Ctrl-L at boot screen, escape when it says to press escape to choose a boot device, choose your USB drive with the Debian installer)
  5. Install as normal to the internal SSD. I used LVM/encryption and it worked just fine. When you reboot, pull the USB drive and it should boot from the internal SSD.
  6. Once booted, the trackpad will not work. A USB mouse will work just great. Download and install my build of the 3.9-rc1 kernel (.deb files and config, full source) built from Linus’ merging of patches and configured with help from Benson Leung
  7. To your /etc/modules add:
    ath9k
    atmel_mxt_ts
    chromeos_laptop
    tpm_tis force=1 interrupts=0
    

    Those are the modules for the wifi (ath9k), the touchpad (both atmel_mxt_ts and chromeos_laptop), and the TPM chip. The tpm_tis options are there to keep it from rebooting when you try to suspend (thanks Duncan Laurie!).

  8. This is no longer necessary with the updated downloads. To your /etc/rc.local, above exit 0, add echo TSCR > /proc/acpi/wakeup. This is a hack to keep it from waking right up after going to sleep (thanks, Benson Leung!)
  9. This appears not to be necessary, actually. Create a file called 01i8042 in /etc/pm/sleep.d to properly sleep and wake the keyboard on suspend. It should have this as its contents.
    #!/bin/sh
    
    ###############################################################################
    # Pm-utils script to unbind i8042 on hibernate/suspend and
    # bind it on thaw/resume.
    #
    # Copyright: Copyright (c) 2009 Nicolay Doytchev
    # License:   GPL-3
    ###############################################################################
    
    ###############################################################################
    # INSTALL:
    #   1. Copy this script to /etc/pm/sleep.d/
    #   2. Make it executable:
    #       sudo chmod +x /etc/pm/sleep.d/01i8042
    #
    # UNINSTALL:
    #   1. Delete the script from /etc/pm/sleep.d/
    #       sudo rm /etc/pm/sleep.d/01i8042
    ###############################################################################
    
    case "$1" in
        hibernate|suspend)
            echo -n "i8042" > /sys/bus/platform/drivers/i8042/unbind
        ;;
        thaw|resume)
            echo -n "i8042" > /sys/bus/platform/drivers/i8042/bind
        ;;
    esac
    

    Make it executable. Found this here

  10. You may need to install the firmware-atheros package for the bluetooth to work, which may require adding non-free to the end of the deb lines in /etc/apt/sources.list
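If you would rather script step 7 than edit /etc/modules by hand, something like the following could work. Treat it as a sketch under assumptions: ensure_line and add_pixel_modules are names invented for this example, and the only thing taken from the steps above is the list of four module lines. It appends each line only if the exact line is not already in the file, so running it twice does no harm.

```shell
#!/bin/sh
# ensure_line is a hypothetical helper: append a line to a file only if
# that exact line is not already present.
# $1: file to edit; $2: the line to add
ensure_line() {
    file="$1"
    line="$2"
    # -x matches whole lines only, -F treats the pattern as a fixed string,
    # and -- protects against lines that begin with a dash.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# add_pixel_modules is also hypothetical: it adds the four module lines
# from step 7 to the given file (defaults to /etc/modules).
add_pixel_modules() {
    target="${1:-/etc/modules}"
    ensure_line "$target" "ath9k"
    ensure_line "$target" "atmel_mxt_ts"
    ensure_line "$target" "chromeos_laptop"
    ensure_line "$target" "tpm_tis force=1 interrupts=0"
}
```

Run add_pixel_modules as root to edit the real /etc/modules, or pass it a scratch file first to see what it would write.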

I think that’s it, but it’s always hard to correctly recreate these things after the fact without redoing it (and I’ve spent enough time on this already). Let me know if you have any problems or improvements, or go comment on Linus’ G+ post about it.

I want to add a special thank-you to Linus Torvalds for posting about his updates on Google+.

EDIT: More stuff
First, if you hate tap-to-click like me, here’s an xorg.conf with the correct dpi setup and multi-finger click.

Here is a folder containing all the relevant files and info (plus my other configs I decide to upload) discussed in this post and the comments.

Also, FYI the keyboard backlight does not come on after suspend right now.

EDIT 2013-03-11 20:33 MDT: By the way, if you run i3 or some other nerdy window manager like me, and you forgot how to make Debian sleep when the lid is closed, you just need to uncomment the appropriate line in /etc/default/acpi-support.
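For the lid-close setting, a small script along these lines might save you opening an editor. It is a sketch that rests on one loud assumption: that the commented-out line in your /etc/default/acpi-support reads #LID_SLEEP=true. Check your own copy of the file first, since the name may differ between versions. The function name enable_lid_sleep is made up for this example.

```shell
#!/bin/sh
# enable_lid_sleep is a hypothetical helper.
# ASSUMPTION: the relevant line in the config file is "#LID_SLEEP=true".
# $1: config file to edit (defaults to /etc/default/acpi-support)
enable_lid_sleep() {
    conf="${1:-/etc/default/acpi-support}"
    # Keep a backup of the original next to the edited file.
    cp "$conf" "$conf.bak" || return 1
    # Strip the leading "#" (and any spaces after it) from the assumed line;
    # every other line is copied through unchanged.
    sed 's/^#[[:space:]]*LID_SLEEP=true/LID_SLEEP=true/' "$conf.bak" > "$conf"
}
```

Run it as root with no arguments to edit the real file. The original is kept alongside it with a .bak suffix in case you want to undo the change.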