Jul 02

Usenix 2004: Thursday notes

Google Plenary Session

  • Basic design criteria:

  • Google-scale services will have total machine failure many times a day
  • Fancy machines won’t raise MTBF enough to avoid needing fault-tolerant software; once you have fault-tolerant software, you can buy tons of cheap hardware and just go for massive redundancy
  • ~1000 machines will see your query
  • Failures are too frequent to rely on human involvement
  • GFS replicates 3x for safety, and may replicate more for load (heavily used chunks are copied more)
  • Data center cooling failure: “When the fire trucks show up you know there’s a problem”
  • Commodity hardware is the best way to get fault tolerance

    • “Cheap and nasty works” - disks stacked on motherboards w/acrylic, held together by velcro (“but make sure there’s enough cooling”).
    • “The gleaming racks from the boom are all gone”
    • “We treat commodity disks like server disks and pay the price sometimes”
      Anecdote about disks which got so overloaded that the ASICs desoldered off the board
    • Write better software since you need it anyway (the whole discussion keeps coming back to this principle)

      • When you can unplug an entire rack and not lose any data you’re doing it right
      • Q&A

        • Sun engineer asked whether they’re doomed (paraphrasing only slightly). A: If you can fit everything on one or two machines, or lack the talent to make a non-trivial investment in massive fault-tolerant software, buy high-quality hardware. The server hardware & software community has largely failed to deliver true fault tolerance. Network and other non-trivial equipment is not “cheap” hardware.
        • Security - does Google have special problems? A: No more than any other large site
        • Questions about slopped-together machines A: “It got us through the dot-com days but it has a lot of problems”
        • Disposal costs A: Couldn’t comment in detail (IPO restrictions) but “we can’t afford to give anything away”
        • Return to cheap hardware discussion A: It gets down to how well your problem can be subdivided. Some will be stuck on big-iron forever
        • Scope of the engineering for Google-level fault tolerance A: It’s more a mindset than a single project - “a bunch of smart people worked on it”.
        • Software testing: how does Google test things? A: Testing at every level. Extensive real-time monitoring and partial rollout (e.g. <1% of Google’s traffic is a sizable test case)
        • Hardware design A: Yes, the disks are velcroed in
        • Software deployment A: Massive home-brew system
        • Will Google release GFS? A: Tentative internal discussions
        • Has Google ever had a massive failure? A: Yes - all the time (e.g. data center on fire). Continued to hammer the “get fault tolerance right” message.
        • Can Google release the “Day in Google” map animation? A: Possibly, yes
        • How many 9s reliability? A: IPO - can’t comment
        • Are quoted MTBFs actually true? A: I don’t know but we use disks in unexpected ways
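The "many failures a day" and "replicate 3x" points above come down to simple arithmetic. The following is a rough sketch only: the fleet size, MTBF, and per-copy loss probability are made-up illustrative numbers (none were given in the talk), and it assumes failures are independent and occur at a constant rate, which a rack-level outage or a cooling failure would violate.

```python
HOURS_PER_DAY = 24


def expected_failures_per_day(machines: int, mtbf_hours: float) -> float:
    """Expected machine failures per day for a fleet.

    Assumes independent failures at a constant rate of 1/MTBF per machine.
    """
    return machines * HOURS_PER_DAY / mtbf_hours


def prob_all_replicas_lost(p_single: float, replicas: int = 3) -> float:
    """Chance that every replica of a chunk is lost in the same window.

    Assumes each copy fails independently with probability p_single.
    """
    return p_single ** replicas


if __name__ == "__main__":
    # Hypothetical fleet: 10,000 machines, 30,000-hour MTBF each.
    print(expected_failures_per_day(10_000, 30_000))  # 8.0
    # Doubling MTBF with pricier hardware only halves the failure count.
    print(expected_failures_per_day(10_000, 60_000))  # 4.0
    # Hypothetical 1% chance of losing any one copy in a given window.
    print(prob_all_replicas_lost(0.01, replicas=3))
```

Even with hardware twice as reliable, the hypothetical fleet still loses several machines every day, which is the speaker's argument for putting the effort into software instead.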
        FREENIX

          Glitz: Hardware Accelerated Image Compositing Using OpenGL

          Peter Nilsson and David Reveman, Umeå University

          • Part of Cairo (PDF 1.4 vector-based graphic subsystem)
          • Significant performance improvements vs. imlib/xrender/etc. - often order-of-magnitude increases

          High Performance X Servers in the Kdrive Architecture

          Eric Anholt, LinuxFund.org

          How Xlib Is Implemented (and What We’re Doing About It)

          Jamey Sharp, Portland State University

          Sysadmin Guru Session

          • Limiting root access:
          • Political problems, often requires outback networks
          • Least privilege / Kerberos systems
        • Config management: almost everyone does it, usually cfengine
        • Ticketing systems: why not tie a ticket number to every config change?
          • RT very popular - the only system which most users liked
          • Eventum http://dev.mysql.com/downloads/other/eventum/
          • Project management systems
            • Nobody likes them - expensive, bad products for the wrong problem
            • Documentation
              • Wikis are very popular
              • Search is critical
              • Moving FAQs from tickets to docs
              • Working with users
                • Surveys - run them frequently, but make sure they’re open-ended enough. No substitute for simply talking with people.
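The idea from the session of tying a ticket number to every config change could be enforced with a small check on the change description. This is a minimal sketch, not something shown in the session: the `RT#1234` reference format and the function names are hypothetical, and a real site would wire the check into whatever version control or config management hook it uses.

```python
import re

# Hypothetical ticket-reference format such as "RT#1234"; adjust per site.
TICKET_PATTERN = re.compile(r"\bRT#(\d+)\b")


def ticket_ids(change_message: str) -> list[str]:
    """Return every ticket number referenced in a change description."""
    return TICKET_PATTERN.findall(change_message)


def change_is_ticketed(change_message: str) -> bool:
    """True if the change description cites at least one ticket."""
    return bool(ticket_ids(change_message))


if __name__ == "__main__":
    for message in ("Raise ntpd limits (RT#4821)", "quick fix"):
        status = "ok" if change_is_ticketed(message) else "REJECT: no ticket"
        print(f"{message!r}: {status}")
```

Rejecting unticketed changes at commit time keeps the ticket history and the config history cross-referenced without relying on anyone remembering to do it.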
                UseLinux: Data

                  Linux Genomics

                  • Post-internet field:
                  • Extensive collaboration - weblogs, forums, public database, etc.
                  • Heavy use of OSS: Linux, MySQL, Perl plus a significant amount of scientific software

                  Thin-client Linux

                  • Business needs for a cardiology practice
                  • Easy & fast networking
                  • Shared file access
                  • Easy expansion: sending techs out is expensive
                • The case for thin clients
                  • Hardware, support, software, and training costs are all lower
                  • Ongoing costs plummet: in the speaker’s medical practice, per-user cost went from $830/year to $230.
                  • The tools
                    • OpenLDAP, Zabbix monitoring, LTSP
                    • The results
                      • 100 users on xSeries 335, ~20Mbps on 100BT
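The figures quoted above can be combined into a quick back-of-the-envelope calculation. This is a sketch under assumptions not stated in the talk: that the $230 figure is also per year, that the 100 users on the server are the same population the cost figures apply to, and that the ~20Mbps is spread evenly across them.

```python
COST_BEFORE = 830  # dollars per user per year, as quoted
COST_AFTER = 230   # dollars per user, assumed to be per year as well
USERS = 100        # assumed to be the same users the costs apply to
LINK_MBPS = 20     # approximate observed traffic on the 100BT network


def savings_per_user(before: float, after: float) -> float:
    """Cost reduction per user."""
    return before - after


def fractional_reduction(before: float, after: float) -> float:
    """Savings as a fraction of the original cost."""
    return (before - after) / before


if __name__ == "__main__":
    per_user = savings_per_user(COST_BEFORE, COST_AFTER)
    print(f"Savings per user: ${per_user}")
    print(f"Reduction: {fractional_reduction(COST_BEFORE, COST_AFTER):.0%}")
    print(f"Savings across {USERS} users: ${per_user * USERS}")
    print(f"Average bandwidth per user: {LINK_MBPS / USERS} Mbps")
```

Under those assumptions the average per-user bandwidth is a small fraction of a 100BT link, which fits the speaker's point that one server and an ordinary network were enough.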