UW data science team at AAAS Annual Meeting

There’s a new kind of researcher on campus, one who doesn’t fit into the usual nooks and crannies at a university.

They are data scientists – students, faculty members and professional scientific staff – who are building the tools and crafting the methods to help researchers analyze the vast amounts of data now abundant in every field. The very nature of their skill set is interdisciplinary, and the university system doesn’t always reward them for the time they spend developing techniques and software to advance discovery.

This dilemma, and what universities can do to change it, is the topic of a symposium Feb. 15 at the American Association for the Advancement of Science annual meeting in San Jose, California. The session, “Advancing University Career Paths in Interdisciplinary Data-Intensive Science,” was organized by UW’s Cecilia Aragon and Bill Howe, and also includes UW’s Ed Lazowska, Berkeley’s Joshua Bloom and Fernando Perez, and NYU’s Juliana Freire – partners in the Data Science Environments project supported by the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation.

Read a UW News post here.  See Ed Lazowska’s introductory slides here.


Blog post by Brittany Fiore-Silfvast

The Data Science Environment (DSE) Summit took place in beautiful Monterey, CA at the Asilomar Conference Center. The Summit brought together over a hundred participants across three universities (UW, UC Berkeley and NYU) involved in the Moore and Sloan Foundations’ Data Science Environment grant.

As a data science ethnographer, I typically take on the role of participant-observer at data science events, but at the DSE Summit I ended up being more of a participant than an observer. The high degree of participation made it challenging at times to listen as closely as I would have liked for underlying rhythms and patterns across the group. However, by participating in the discussion sessions and interactions, I identified some important undercurrents, which I draw out into the two main themes discussed in this post.

image of Monterey coastline

Photo credit: Kevin Koy


from UW CSE News:


SeaFlow, a research instrument developed in the lab of UW School of Oceanography director Ginger Armbrust, analyzes 15,000 marine microorganisms per second, generating up to 15 gigabytes of data every single day of a typical multi-week-long oceanographic research cruise.

UW professor of astronomy Andy Connolly is preparing for the unveiling of the Large Synoptic Survey Telescope (LSST), which will map the entire night sky every three days and produce about 100 petabytes of raw data about our universe over the course of 10 years. (One petabyte of music in MP3 format would take 2,000 years to play.)

What scientists like Armbrust and Connolly have is popularly known as "big data," and as rich and exciting as it can be, big data can also be a big problem.

"Every field of discovery is transitioning from data-poor to data-rich, and the people doing the research don’t have the wherewithal to cope with this data deluge," says Ed Lazowska, director of the UW’s eScience Institute.

“And now, the eScience team – the core team includes faculty from 12 departments representing five schools and colleges – is poised to scale way up. Last year, the UW won a five-year, $37.8 million grant from the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation that will be shared with New York University and the University of California, Berkeley, to foster a data science culture at the three universities.

“‘We don’t want this to be a magic trick that only computer scientists know how to do,’ [eScience Institute Associate Director Bill] Howe says. ‘It should be something that everybody can do.’”

Read more in html or pdf.


from UW CSE News:

The Gordon and Betty Moore Foundation joined last year with the Alfred P. Sloan Foundation in a process that ultimately selected the University of Washington, UC Berkeley, and New York University as partners in a 5-year, $37.8 million collaborative effort to advance data-intensive discovery.

The Moore Foundation has just announced the results of a subsequent competition to identify leading individual researchers as “Data-Driven Discovery Investigators,” funded at $1.5 million each. From an original field of more than 1,000 pre-proposals, roughly 100 researchers were invited to submit full proposals. Twenty-eight of these were invited to participate in a workshop, after which 14 were selected as recipients of Moore Foundation Data-Driven Discovery Investigator Awards – including UW CSE professor Jeff Heer.



The first Astro Hack Week took place September 15-19, 2014, at the University of Washington. We had about 45 attendees through the week. We spent the mornings together learning new coding, statistics, and data analysis skills, and spent the afternoons working in pairs and groups on a wide variety of projects. These projects spanned a range of topics and included everything from short exercises to the development of teaching materials to full-blown research projects that will likely lead to publications!

Along with these hacks, the afternoons were also punctuated by informal breakout sessions on everything from using Git to constructing Probabilistic Graphical Models. Thanks to all the participants who stepped up to lead these breakouts and share their expertise with others!

(Photo by Adrian Price-Whelan)


