Donderdag 28 Maart 2019

LAN-party House: Technical Design And FAQ

After I posted about my LAN-party optimized house, lots of people have asked for more details about the computer configuration that allows me to maintain all the machines as if they were only one. I also posted the back story to how I ended up with this house, but people don't really care about me, they want to know how it works! Well, here you go!

Let's start with some pictures...

Sorry that there are no "overview" shots, but the room is pretty small and without a fish-eye lens it is hard to capture.

Hardware

In the pictures above, Protoman is a 2U rackmount server machine with the following specs:

  • CPU: Intel Xeon E3-1230
  • Motherboard: Intel S1200BTL
  • RAM: 4GB (2x2GB DDR3-1333 ECC)
  • OS hard drive: 60GB SSD
  • Master image storage: 2x1TB HDD (RAID-1)
  • Snapshot storage: 240GB SATA-3 SSD

I'll get into the meaning of all the storage in a bit.

The other machines on the rack are the gaming machines, each in a 3U case. The specs are:

  • CPU: Intel Core i5-2500
  • GPU: MSI N560GTX (nVidia GeForce 560)
  • Motherboard: MSI P67A-C43 (Intel P67 chipset)
  • RAM: 8GB (2x4GB DDR3-1333)
  • Local storage: 60GB SSD

Megaman and Roll are the desktop machines used day-to-day by myself and Christina Kelly. These machines predate the house and aren't very interesting. (If you aren't intimately familiar with the story of Megaman, you are probably wondering about the name "Roll". Rock and Roll were robots created by Dr. Light to help him with lab work and housekeeping. When danger struck, Dr. Light converted Rock into a fighting machine, and renamed him "Megaman", thus ruining the pun before the first Megaman game even started. Roll was never converted, but she nevertheless holds the serial number 002.)

The gaming machines are connected to the fold-out gaming stations via 35-foot-long HDMI and USB cables that run through cable tubes built into the house's foundation. Megaman and Roll are connected to our desks via long USB and dual-link DVI cables. I purchased all cables from Monoprice, and I highly recommend them.

Network boot

Originally, I had the gaming machines running Ubuntu Linux, using WINE to support Windows games. More recently, I have switched to Windows 7. The two configurations are fairly different, but let me start by describing the parts that are the same. In both cases, the server runs Ubuntu Linux Server, and all server-side software that I used is free, open source software available from the standard Ubuntu package repository.

As described in the original post, the gaming machines do not actually store their operating system or games locally. Indeed, their tiny 60GB hard drives couldn't even store all the games. Instead, the machines boot directly over the network. All modern network adapters support a standard for this called PXE. You simply have to enable it in the bios, and configure your DHCP server to send back the necessary information to get the boot process started.

I have set things up so that the client machines can boot in one of two modes. The server decides what mode to use, and I have to log into the server and edit the configs to switch -- this ensures that guests don't "accidentally" end up in the wrong mode.

  • Master mode: The machine reads from and writes to the master image directly.
  • Replica mode: The machine uses a copy-on-write overlay on top of the master image. So, the machine starts out booting from a disk image that seems to be exactly the same as the master, but when it writes to that image, a copy is made of the modified blocks, and only the copy is modified. Thus, the writes are visible only to that one machine. Each machine gets its own overlay. I can trivially wipe any of the overlays at any time to revert the machine back to the master image.

The disk image is exported using a block-level protocol rather than a filesystem-level protocol. That is, the client sends requests to the server to read and write the raw disk image directly, rather than requests for particular files. Block protocols are massively simpler and more efficient, since they allow the client to treat the remote disk exactly like a local disk, employing all the same caching and performance tricks. The main down side is that most filesystems are not designed to allow multiple machines to manipulate them simultaneously, but this is not a problem due to the copy-on-write overlays -- the master image is read-only. Another down side is that access permissions can only be enforced on the image as a whole, not individual files, but this also doesn't matter for my use case since there is no private data on the machines and all modifications affect only that machine. In fact, I give all guests admin rights to their machines, because I will just wipe all their changes later anyway.

Amazingly, with twelve machines booting and loading games simultaneously off the same master over a gigabit network, there is no significant performance difference compared to using a local disk. Before setting everything up, I had been excessively worried about this. I was even working on a custom UDP-based network protocol where the server would broadcast all responses, so that when all clients were reading the same data (the common case when everyone is in the same game), each block would only need to be transmitted once. However, this proved entirely unnecessary.

Original Setup: Linux

Originally, all of the machines ran Ubuntu Linux. I felt far more comfortable setting up network boot under Linux since it makes it easy to reach into the guts of the operating system to customize it however I need to. It was very unclear to me how one might convince Windows to boot over the network, and web searches on the topic tended to come up with proprietary solutions demanding money.

Since almost all games are Windows-based, I ran them under WINE. WINE is an implementation of the Windows API on Linux, which can run Windows software. Since it directly implements the Windows API rather than setting up a virtual machine under which Windows itself runs, programs execute at native speeds. The down side is that the Windows API is enormous and WINE does not implement it completely or perfectly, leading to bugs. Amazingly, a majority of games worked fine, although many had minor bugs (e.g. flickering mouse cursor, minor rendering artifacts, etc.). Some games, however, did not work, or had bad bugs that made them annoying to play. (Check out the Wine apps DB to see what works and what doesn't.)

I exported the master image using NBD, a Linux-specific protocol that is dead simple. The client and server together are only a couple thousand lines of code, and the protocol itself is just "read block, write block" and that's it.

Here's an outline of the boot process:

  1. BIOS boots to the ethernet adaptor's PXE "option ROM" -- a little bit of code that lives on the Ethernet adapter itself.
  2. The Ethernet adaptor makes DHCP request. The DHCP response includes instructions on how to boot.
  3. Based on the instructions, the Ethernet adaptor downloads and runs a pxelinux (a variant of syslinux) boot image from TFTP server identified by DHCP.
  4. pxelinux downloads and runs the real Linux kernel and initrd image, then starts them.
  5. initrd script loads necessary drivers, connects to NBD server, and mounts the root filesystem, setting up the COW overlay if desired.
  6. Ubuntu init scripts run from root filesystem, bringing up the OS.

Crazy, huh? It's like some sort of Russian doll. "initrd", for those that don't know, refers to a small, packed, read-only filesystem image which is loaded as part of the boot process and is responsible for mounting the real root filesystem. This allows dynamic kernel modules and userland programs to be involved in this process. I had to edit Ubuntu's initrd in order to support NBD (it only supports local disk and NFS by default) and set up the COW overlay, which was interesting. Luckily it's very easy to understand -- it's just an archive in CPIO format containing a bunch of command-line programs and bash scripts. I basically just had to get the NBD kernel module and nbd-client binary in there, and edit the scripts to invoke them. The down side is that I have to re-apply my changes whenever Ubuntu updated the standard initrd or kernel. In practice I often didn't bother, so my kernel version fell behind.

Copy-on-write block devices are supported natively in Linux via "device-mapper", which is the technology underlying LVM. My custom initrd included the device-mapper command-line utility and invoked it in order to set up the local 60GB hard drive as the COW overlay. I had to use device-mapper directly, rather than use LVM's "snapshot" support, because the master image was a read-only remote disk, and LVM wants to operate on volumes that it owns.

The script decides whether it is in master or replica mode based on boot parameters passed via the pxelinux config, which is obtained via TFTP form the server. To change configurations, I simply swap out this config.

New setup: Windows 7

Linux worked well enough to get us through seven or so LAN parties, but the WINE bugs were pretty annoying. Eventually I decided to give in and install Windows 7 on all the machines.

I am in the process of setting this up now. Last weekend I started a new master disk image and installed Windows 7 to it. It turns out that the Windows 7 installer supports installing directly to a remote block device via the iSCSI protocol, which is similar to NBD but apparently more featureful. Weirdly, though, Windows 7 apparently expects your network hardware to have boot-from-iSCSI built directly into its ROM, which most standard network cards don't. Luckily, there is an open source project called gPXE which fills this gap. You can actually flash gPXE over your network adaptor's ROM, or just bootstrap it over the network via regular PXE boot. Full instructions for setting up Windows 7 to netboot are here.

Overall, setting up Windows 7 to netboot was remarkably easy. Unlike Ubuntu, I didn't need to hack any boot scripts -- which is good, because I wouldn't have any clue how to hack Windows boot scripts. I did ran into one major snag in the process, though: The Windows 7 installer couldn't see the iSCSI drive because it did not have the proper network drivers for my hardware. This turned out to be relatively easy to fix once I figured out how:

  • Download the driver from the web and unzip it.
  • Find the directory containing the .inf file and copy it (the whole directory) to a USB stick.
  • Plug the USB stick into the target machine and start the Windows 7 installer.
  • In the installer, press shift+F10 to open the command prompt.
  • Type: drvload C:\path\to\driver.inf

With the network card operational, the iSCSI target appeared as expected. The installer even managed to install the network driver along with the rest of the system. Yay!

Once Windows was installed to the iSCSI target, gPXE could then boot directly into it, without any need for a local disk at all. Yes, this means you can PXE-boot Windows 7 itself, not just the installer.

Unfortunatley, Windows has no built-in copy-on-write overlay support (that I know of). Some proprietary solutions exist, at a steep price. For now, I am instead applying the COW overlay server-side, meaning that writes will actually go back to the server, but each game station will have a separate COW overlay allocated for it on the server. This should be mostly fine since guests don't usually install new games or otherwise write much to the disk. However, I'm also talking to the author of WinVBlock, an open source Windows virtual block device driver, about adding copy-on-write overlay support, so that the local hard drives in all these machines don't go to waste.

Now that the COW overlays are being done entirely server-side, I am able to take full advantage of LVM. For each machine, I am allocating a 20GB LVM snapshot of the master image. The snapshots all live on the 240GB SATA-3 SSD, since the server will need fast access to the tables it uses to manage the COW overlays. (For now, the snapshots are allocated per-machine, but I am toying with the idea of allocating them per-player, so that a player can switch machines more easily (e.g. to balance teams). However, with the Steam Cloud synchronizing most game settings, this may not be worth the effort.)

Normally, LVM snapshots are thought of as a backup mechanism. You allocate a snapshot of a volume, and then you go on modifying the main volume. You can use the snapshot to "go back in time" to the old state of the volume. But LVM also lets you modify the snapshot directly, with the changes only affecting the snapshot and not the main volume. In my case, this latter feature is the critical functionality, as I need all my machines to be able to modify their private snapshots. The fact that I can also modify the master without affecting any of the clones is just a convenience, in case I ever want to install a new game or change configuration mid-party.

I have not yet stress-tested this new setup in an actual LAN party, so I'm not sure yet how well it will perform. However, I did try booting all 12 machines at once, and starting Starcraft 2 on five machines at once. Load times seem fine so far.

Frequently Asked Questions

How do you handle Windows product activation?

I purchased 12 copies of Windows 7 Ultimate OEM System Builder edition, in 3-packs. However, it turns out that because the hardware is identical, Windows does not even realize that it is moving between machines. Windows is tolerant of a certain number of components changing, and apparently this tolerance is just enough that it doesn't care that the MAC address and component serial numbers are different.

Had Windows not been this tolerant, I would have used Microsoft's VAMT tool to manage keys. This tool lets you manage activation for a fleet of machines all at once over the network. Most importantly, it can operate in "proxy activation" mode, in which it talks to Microsoft's activation servers on the machines' behalf. When it does so, it captures the resulting activation certificates. You can save these certificates to a file and re-apply them later, whenever the machines are wiped.

Now that I know about VAMT, I intend to use it for all future Windows activations on any machine. Being able to back up the certificate and re-apply it later is much nicer than having to call Microsoft and explain myself whenever I re-install Windows.

I highly recommend that anyone emulating my setup actually purchase the proper Windows licenses even if your machines are identical. The more machines you have, the more it's actually worth Microsoft's time to track you down if they suspect piracy. You don't want to be caught without licenses.

You might be able to get away with Windows Home Premium, though. I was not able to determine via web searching whether Home Premium supports iSCSI. I decided not to risk it.

UPDATE: At the first actual LAN party on the new Windows 7 setup, some of the machines reported that they needed to be activated. However, Windows provides a 3-day grace period, and my LAN party was only 12 hours. So, I didn't bother activating. Presumably once I wipe these snapshots and re-clone from the master image for the next party, another 3-day grace period will start, and I'll never have to actually activate all 12 machines. But if they do ever demand immediate activation, I have VAMT and 12 keys ready to go.

Do guests have to download their own games from Steam?

No. Steam keeps a single game cache shared among all users of the machine. When someone logs into their account, all of the games that they own and which are installed on the machine are immediately available to play, regardless of who installed them. Games which are installed but not owned by the user will show up in the list with a convenient "buy now" button. Some games will even operate in demo mode.

This has always been one of my favorite things about Steam. The entire "steamapps" folder, where all game data lives, is just a big cache. If you copy a file from one system's "steamapps" to another, Steam will automatically find it, verify its integrity, and use it. If one file out of a game's data is missing, Steam will re-download just that file, not the whole game. It's fault-tolerant software engineering at its finest.

On a similar note, although Starcraft 2 is not available via Steam, an SC2 installation is not user-specific. When you star the game, you log in with your Battle.net account. Party guests thus log in with their own accounts, without needing to install the game for themselves.

Any game that asks for ownership information at install time (or first play) rather than run time simply cannot be played at our parties. Not legally, at least.

Is your electricity bill enormous?

I typically have one LAN party per month. I use about 500-600 kWh per month, for a bill of $70-$80. Doesn't seem so bad to me.

Why didn't you get better chairs!?!

The chairs are great! They are actually pretty well-padded and comfortable. Best of all, they stack, so they don't take much space when not in use.

You can afford all these computers but you have cheap Ikea furniture?

I can afford all these computers because I have cheap Ikea furniture. :)

I had no money left for new furniture after buying the computers, so I brought in the couches and tables from my old apartment.

How can you play modern games when most of them don't support LAN mode?

I have an internet connection. If a game has an online multiplayer mode, it can be used at a LAN party just fine.

While we're on the subject, I'd like to gush about my internet connection. My download bandwidth is a consistent 32Mbit. Doesn't matter what time of day. Doesn't matter how much bandwidth I've used this month. 32Mbit. Period.

My ISP is Sonic.net, an independent ISP in northern California. When I have trouble with Sonic -- which is unusual -- I call them up and immediately get a live person who treats me with respect. They don't use scripts, they use emulators -- the support person is running an emulator mimicking my particular router model so that they can go through the settings with me.

Best of all, I do not pay a cent to the local phone monopoly (AT&T) nor the local cable monopoly (Comcast). Sonic.net provides my phone lines, over which they provide DSL internet service.

Oh yeah. And when I posted about my house the other day, the very first person to +1 it on G+, before the post had hit any news sites, was Dane Jasper, CEO of Sonic.net. Yeah, the CEO of my ISP followed me on G+, before I was internet-famous. He also personally checked whether or not my house could get service, before it was built. If you e-mail him, he'll probably reply. How cool is that?

His take on bandwidth caps / traffic shaping? "Bandwidth management is not used in our network. We upgrade links before congestion occurs."

UPDATE: If you live outside the US, you might be thinking, "Only 32Mbit?". Yes, here in the United States, this is considered very fast. Sad, isn't it?

What's your network infrastructure? Cisco? Juniper?

Sorry, just plain old gigabit Ethernet. I have three 24-port D-Link gigabit switches and a DSL modem provided by my ISP. That's it.

Why didn't you get the i5-2500k? It is ridiculously overclockable.

I'm scared of overclocking. The thought of messing with voltages or running stability tests gives me the shivers. I bow to you and your superior geek cred, oh mighty overclocker.

What do you do for cooling?

I have a 14000 BTU/hr portable air conditioner that is more than able to keep up with the load. I asked my contractor to install an exhaust vent in the wall of the server room leading outside (like you'd use for a clothes dryer), allowing the A/C to exhaust hot air.

My house does not actually have any central air conditioning. Only the server room is cooled. We only get a couple of uncomfortably-hot days a year around here.

Dragging over your own computers is part of the fun of LAN parties. Why build them in?

I know what you mean, having hosted and attended dozens of LAN parties in the past. I intentionally designed the stations such that guests could bring their own system and hook it up to my monitor and peripherals if they'd like. In practice, no one does this. The only time it ever happened is when two of the stations weren't yet wired up to their respective computers, and thus it made sense for a couple people to bust out their laptops. Ever since then, while people commonly bring laptops, they never take them out of their bags. It's just so much more convenient to use my machines.

This is even despite the fact that up until now, my machines have been running Linux, with a host of annoying bugs.

How did you make the cabinetry? Can you provide blueprints?

I designed the game stations in Google Sketchup and then asked a cabinet maker to build them. I just gave him a screenshot and rough dimensions. He built a mock first, and we iterated on it to try to get the measurements right.

I do not have any blueprints, but there's really not much to these beyond what you see in the images. They're just some wood panels with hinges. The desk is 28" high and 21" deep, and each station is 30" wide, but you may prefer different dimensions based on your preferences, the space you have available, and the dimensions of the monitor you intend to use.

The only tricky part is the track mounts for the monitors, which came from ErgoMart. The mount was called "EGT LT V-Slide MPI" on the invoice, and the track was called "EGT LT TRACK-39-104-STD". I'm not sure if I'd necessarily recommend the mount, as it is kind of difficult to reach the knob that you must turn in order to be able to loosen the monitor so that it can be raised or lowered. They are not convenient by any means, and my friends often make me move the monitors because they can't figure it out. But my contractor and I couldn't find anything else that did the job. ErgoMart has some deeper mounts that would probably be easier to manipulate, at the expense of making the cabinets deeper (taking more space), which I didn't want to do.

Note that the vertical separators between the game stations snap out in order to access wiring behind them.

Here is Christina demonstrating how the stations fold out!

What games do you play?

Off the top of my head, recent LAN parties have involved Starcraft 2, Left 4 Dead 2, Team Fortress 2, UT2k4, Altitude, Hoard, GTA2, Alien Swarm, Duke Nukem 3D (yes, the old one), Quake (yes, the original), and Soldat. We like to try new things, so I try to have a few new games available at each party.

What about League of Legends?

We haven't played that because it doesn't work under WINE (unless you manually compile it with a certain patch). I didn't mind this so much as I personally really don't like this game or most DotA-like games. Yes, I've given it a chance (at other people's LAN parties), but it didn't work for me. To each their own, and all that. But now that the machines are running Windows, I expect this game will start getting some play-time, as many of my friends are big fans.

Do you display anything on the monitors when they're not in use?

I'd like to, but haven't worked out how yet. The systems are only on during LAN parties, since I don't want to be burning the electricity or running the A/C 24/7. When a system is not in use during a LAN party, it will be displaying Electric Sheep, a beautiful screensaver. But outside of LAN parties, no.

UPDATE: When I say I "haven't worked out how yet," I mean "I haven't thought about it yet," not "I can't figure out a way to do it." It seems like everyone wants to tell me how to do this. Thanks for the suggestions, guys, but I can figure it out! :)

The style is way too sterile. It looks like a commercial environment. You should have used darker wood / more decoration.

I happen to very much like the style, especially the light-colored wood. To each their own.

How much did all this cost?

I'd rather not get into the cost of the house as a whole, because it's entirely a function of the location. Palo Alto is expensive, whether you are buying or building. I will say that my 1426-square-foot house is relatively small for the area and hence my house is not very expensive relative to the rest of Palo Alto (if it looks big, it's because it is well-designed). The house across the street recently sold for a lot more than I paid to build mine. Despite the "below average" cost, though, I was just barely able to afford it. (See the backstory.)

I will say that the LAN-party-specific modifications cost a total of about $40,000. This includes parts for 12 game machines and one server (including annoyingly-expensive rack-mount cases), 12 keyboards, 12 mice, 12 monitors, 12 35' HDMI cables, 12 32' USB cables, rack-mount hardware, network equipment, network cables, and the custom cabinetry housing the fold-out stations. The last bit was the biggest single chunk: the cabinetry cost about $18,000.

This could all be made a lot cheaper in a number of ways. The cabinetry could be made with lower-grade materials -- particle board instead of solid wood. Or maybe a simpler design could have used less material in the first place. Using generic tower cases on a generic shelf could have saved a good $4k over rack-mounting. I could have had 8 stations instead of 12 -- this would still be great for most games, especially Left 4 Dead. I could have had some of the stations be bring-your-own-computer while others had back-room machines, to reduce the number of machines I needed to buy. I could have used cheaper server hardware -- it really doesn't need to be a Xeon with ECC RAM.

Is that Gabe Newell sitting on the couch?

No, that's my friend Nick. But if Gabe Newell wants to come to a LAN party, he is totally invited!

UPDATE: More questions

Do the 35-foot HDMI and 32-foot USB cables add any latency to the setup?

I suppose, given that electricity propagates through typical wires at about 2/3 the speed of light, that my 67 feet of cabling (round trip) add about 100ns of latency. This is several orders of magnitude away from anything that any human could perceive.

A much larger potential source of latency (that wouldn't be present in a normal setup) is the two hubs between the peripherals and the computer -- the powered 4-port to which the peripherals connect, and the repeater in the extension cable that is effectively a 1-port hub. According to the USB spec (if I understand it correctly), these hubs cannot be adding more than a microsecond of latency, still many orders of magnitude less than what could be perceived by a human.

Both of these are dwarfed the video latency. The monitors have a 2ms response time (in other words, 2000x the latency of the USB hubs). 2ms is considered extremely fast response time for a monitor, though. In fact, it's so fast it doesn't even make sense -- at 60fps, the monitor is only displaying a new frame every 17ms anyway.

Do you use high-end gaming peripherals? SteelSeries? Razer?

Oh god no. Those things are placebos -- the performance differences they advertise are far too small for any human to perceive. I use the cheapest-ass keyboards I could find ($12 Logitech) and the Logitech MX518 mouse. Of course, guests are welcome to bring their own keyboard and mouse and just plug them into the hub.

Why not use thin clients and one beefy server / blades / hypervisor VMs / [insert your favorite datacenter-oriented technology]?

Err. We're trying to run games here. These are extremely resource-hungry pieces of software that require direct access to dedicated graphics hardware. They don't make VM solutions for this sort of scenario, and if they did, you wouldn't be able to find hardware powerful enough to run multiple instances of a modern game on one system. Each player really does need a dedicated, well-equipped machine.

I'll keep adding more questions here as they come up.

Geen opmerkings nie:

Plaas 'n opmerking