john wilkes: work

Biography

You can find a brief biography and head shots here.

Google

I joined Google in November 2008. I'm based in Sunnyvale, CA.

Most of my initial work was on the Borg cluster management system that allocates resources to Google's internal compute jobs. Here are a couple of external articles about that system; you can find out much more on my publications page:

Inside the Hacker Mind: John Wilkes on Google Omega. Cade Metz, Wired. April 2013.
Return of the Borg: How Twitter Rebuilt Google’s Secret Weapon. Cade Metz, Wired. March 2013.

In October 2016, I joined the Resource And Infrastructure Information Technology (RIOT) team (part of the Unified Fulfillment Organization, or UFO) that automates the fulfillment and provisioning of computing resources to all of Google: we're on the hook to keep up with Google's exponential growth in demand, as efficiently as possible. At our scale, that's quite a big deal. Here's a short video about our work.

After a few years, I found myself spending most of my time working on delivering network capacity to Google's datacenters, so I moved over to the NetInfra team in 2019, founding and leading the cross-organization effort we call Network infrastructure Capacity Delivery (NiCaD).

I'm also the tech lead for the tool we use to match most technical intern candidates to their hosts. (It's what we call a "20% project" - a side activity that, in this case, supports a business-critical process. Very Google!)

Hewlett-Packard

I was at HP Labs for nearly 26 years. Here's a short summary of what I got up to there.

Self-managing systems

Just before I left HP, I helped found Open Cirrus, an open cloud-computing research testbed designed to support research into the design, provisioning, and management of services at a global, multi-datacenter scale. The original slogan: "PlanetLab for data centers".

Storage systems

The topic area that started my interest in self-managing systems was storage. I founded and led the HP Labs Storage Systems Program for several years.

Together, we worked on the automatic design and configuration of enterprise-scale storage systems, with an emphasis on storage-area networks (SANs) and disk arrays. Our goal was to make such systems much easier to manage.

Accomplishing this required work in:

Capturing high-level goals for the storage system, expressed as Quality-of-service based Service Level Agreements (QoS-SLAs)
Automatically determining the appropriate design and configuration of a storage system to meet those goals, and the placement of data in it. This includes tools to design network topologies for high-speed Storage-Area Networks (SANs) such as Fibre Channel.
Online system management tools that allow data to be moved around on the fly, while it's being accessed, and help to close the loop, allowing a storage system to be completely self-managing.
Mechanisms and languages for describing storage system requirements, capabilities, designs, and deployments.
Scalable storage system architectures that can deliver big-system reliability and performance with small-system price and flexibility.

A retrospective on this work was published in Operating Systems Review, January 2009.

Storage, data, and information systems

In 2007-2008, Christopher Hoover, Beth Keer, Pankaj Mehra, Alistair Veitch and I wrote a book on the storage stack, from hardware up to modern information systems. (I was the editor and publisher.) It grew out of a technology briefing for the HP board of directors. A couple of thousand copies were given away at HP storage conferences, and to students in a CMU class. For about $15, you can buy a copy from Amazon.com.

Earlier work

Before this, I was active in several areas, including:

Helping HP's Storage Systems Division in Boise develop the HP AutoRAID technology.
Network architecture (the Hamlyn sender-based message model).
Operating systems (e.g., the Brevix project).

Many of the papers published by me and other members of the research groups in which I have worked over the years are available online via the Wayback Machine archive, together with some of the software and disk I/O traces that we and other researchers used.

Publications

You can find my own publications here. Several papers on related topics can be found at the HP Labs storage systems web site.

I'm not a mathematician, but my Erdős number is 3, according to AMS's collaboration-distance tool: Eric Anderson → Michael Saks → Paul Erdős.

Patents

I am named as an inventor on 50+ issued US patents. The full list can be found here.

External activities

I've been a Program (Co)Chair for SOSP, EuroSys, and FAST.
I've been a program committee member for SOSP, SoCC, FAST, EuroSys, OSDI, NSDI, and a bunch of workshops, including HotOS, HotCloud, and HotStorage.

Awards and recognition

Principal Software Engineer, Google, 2010.
HP Fellow, 2002.
ACM Fellow, 2002 (member since 1981).
SNIA Outstanding contribution award for work on the SNIA shared storage model, October 2001.
PhD, University of Cambridge, 1984: Workstation design for distributed computing.
British Computer Society Wilkes Award for the best paper published in 1984 in the Computer Journal by an author or authors aged under 30. The award is named for Maurice Wilkes, director of the Computer Laboratory in Cambridge c. 1947–1980.
The paper was: The Rainbow workstation [PDF 5.3MB]. A. J. Wilkes, D. W. Singer, J. J. Gibbons, T. R. King, P. Robinson, and N. E. Wiseman. Computer Journal 27(2):112-120, May 1984. DOI: 10.1093/comjnl/27.2.112. © 1982 British Computer Society.
The Rainbow workstation itself (my PhD thesis research) was given the British Computer Society Technology Award for 1982.
Diploma in Computer Science, University of Cambridge, 1979.
BA in Natural Sciences, University of Cambridge, 1978.
