Designers as data scientists

Data science isn't only the purview of analysts and statisticians; it should be part of a designer's skill set as well.

Note: this post is an excerpt from “Designing with Data,” by Rochelle King and Elizabeth F. Churchill, which is included in “The New Design Fundamentals” ebook, a free curated collection of chapters from our Design library.

It might feel like using data is big news now, but the truth is that we’ve been using data for a long time. For the past 20 years, we’ve been moving and replicating more and more experiences that we used to have in the physical world into the digital world. Sharing photos, having conversations, and performing the duties of our daily work have all become digital. We could probably have a separate discussion about how much the migration from the physical “real” world to the digital world has benefited or harmed our society, but you can’t deny that it’s happening and that it continues to accelerate at a breakneck pace.

Let’s take a look at what it means for these experiences to be moving from the physical to the digital. Not too long ago, the primary way to share photos with someone started with using your camera to take photos at an event. When your roll of film was done, you’d drop it off at the local store for processing. A few days or a week later, you’d pick up your developed photos, and that would be the first time you could evaluate how well the photos you took many days prior had actually turned out. Then, maybe when someone was at your house, you’d pull out those photos and narrate what each one was about. If you really wanted to share those photos with someone else, you’d maybe order duplicates and put them in an envelope to mail, and a few days later your friend would get your photos as well. If you were working at a company like Kodak that had a vested interest in getting people to use your film, processing paper, or cameras, then many steps and parts of the experience I just described were completely out of your control. You also had almost no way to collect insight into your customers’ behaviors and actions along the way.

Now, let’s take the same example of sharing a photo in the digital world. Your user will take out their phone and take a photo. They may open up Instagram, apply some filters to the photo and edit it on the spot before adding a caption and then sharing it. They might also choose to share it on different channels, like Twitter or via email. The entire experience of sharing a photo has been collapsed and condensed into one uninterrupted flow on a single screen, one that you can hold in the palm of your hand. And because all of this is digital, data is continuously being collected along the way. You have access to all kinds of information that you wouldn’t have had before: location, time spent in each step, which filters were tried but not used, what was written about the photo, and to whom the photo was sent. You can also gather consumption data on the photo, such as how many people viewed it or liked it. Not only can you gather that information on one user, but you can gather it for every single user. And that data is both precise and dynamic, so you can get an instant understanding of how your customers’ behaviors and interactions are changing and evolving with respect to your product and in reaction to changes you make to it.
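To make this a little more concrete, here is a minimal sketch of what such interaction data might look like and how two of the measures mentioned above (time spent in each step, and filters tried but not used) could be derived from it. Everything in it is an assumption made for illustration: the `Event` record, the step and action names, and the sample session are hypothetical and do not describe how Instagram or any other product actually logs behavior.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    """One logged interaction (hypothetical schema, for illustration only)."""
    user_id: str
    step: str         # e.g. "capture", "edit", "share"
    action: str       # e.g. "enter_step", "try_filter", "apply_filter"
    timestamp: float  # seconds since the session started
    detail: str = ""  # e.g. the name of the filter


def time_per_step(events):
    """Seconds spent in each step, measured between consecutive enter_step events.

    The final enter_step event only marks where the previous step ended,
    so it gets no duration of its own.
    """
    entries = sorted(
        (e for e in events if e.action == "enter_step"),
        key=lambda e: e.timestamp,
    )
    durations = defaultdict(float)
    for current, following in zip(entries, entries[1:]):
        durations[current.step] += following.timestamp - current.timestamp
    return dict(durations)


def filters_tried_but_not_used(events):
    """Filters the user previewed but never applied."""
    tried = {e.detail for e in events if e.action == "try_filter"}
    applied = {e.detail for e in events if e.action == "apply_filter"}
    return tried - applied


# A made-up session for a single user.
session = [
    Event("u1", "capture", "enter_step", 0.0),
    Event("u1", "edit", "enter_step", 4.0),
    Event("u1", "edit", "try_filter", 6.0, "sepia"),
    Event("u1", "edit", "try_filter", 9.0, "mono"),
    Event("u1", "edit", "apply_filter", 11.0, "mono"),
    Event("u1", "share", "enter_step", 15.0),
    Event("u1", "done", "enter_step", 21.0),
]

print(time_per_step(session))
print(filters_tried_but_not_used(session))
```

Even in a toy example like this, choices such as where a step begins and ends are interpretation decisions rather than facts sitting in the data.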

All this data can be really powerful, but because digital interfaces have made data collection so easy, we have to make sure that we don’t fool ourselves into thinking that data interpretation is easier than it actually is. There is a danger that the ease of gathering data also makes it easier to make bigger mistakes with that data. It becomes our responsibility to use that data responsibly, and to be clear and careful about how we use it. We are just seeing the beginning of a time when data at scale is as accessible to small startups as it is to well-established large companies, and the future holds much promise for what designers will be able to do with access to all this information.

Commoditization of data

Data is quickly becoming commoditized, with companies and services springing up to help you with your data. There are so many tools and services available now to help you gather data about your customers that there is almost no excuse not to be leveraging data more in your design and product development cycles. We’re seeing this commoditization across qualitative research, quantitative testing, analysis, and education.

Companies like usertesting.com and others are making it easier to get qualitative data, even if you don’t have dedicated user research facilities. In fact, you could argue that services like these can be stronger than traditional labs because their virtual nature allows you to gather feedback from customers around the world. Optimizely and other companies are making quantitative data collection easier by allowing companies to run quick and easy A/B tests on their websites. We see data boot camps springing up, and there are also companies to which you can outsource your data analysis so you don’t need to bear the expense of hiring and keeping data analysts on staff.

Finally, we see new data degrees emerging in more and more places.
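For readers who have not looked inside an A/B test of the kind mentioned above, the sketch below shows one common way to compare two design variants: a two-proportion z-test on conversion counts. This is an illustration under stated assumptions, not a description of how Optimizely or any other service computes its results. The function name and the visitor and conversion numbers are made up.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    Returns the z statistic and a two-sided p-value under the null
    hypothesis that both variants share the same true conversion rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate: the best estimate of the shared rate if there is no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 2,000 visitors saw each design.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the two designs is unlikely to be chance alone, but it says nothing about why users behaved differently. That question is where qualitative data comes in.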

Quantitative data at scale

We thought it would be fun to include a little bit of history in this chapter. Taking a step back (way back) might give our readers some perspective on how the smart use of data is something we have been doing for a very long time. As designers, we pride ourselves on being excellent at solving problems, and seeing how people in other industries have used data to illuminate and solve the problems they face can be quite enlightening.

The 15th century marked the beginning of the Age of Discovery, when Europeans embarked on expeditions to explore the world. However, on especially long trips where they could not store fruits and vegetables, scurvy was a significant problem. As R.E. Hughes chronicles in a report, in May 1747, aboard the British Navy ship HMS Salisbury, a naval surgeon named James Lind conducted an experiment to identify a cure for scurvy. He chose six pairs of seamen suffering from the disease and gave a different “remedy” to each pair, in addition to their normal rations. Five of these pairs of sailors showed no significant improvement, but one pair who had been prescribed oranges and lemons quickly showed signs of recovering from the disease. Hughes quotes Lind’s account of the experiment:

On the 20th of May, 1747, I took 12 patients in the scurvy, on board the Salisbury at sea. Their cases were as similar as I could have them. They all, in general, had putrid gums, the spots and lassitude, with weakness of the knees. They lay together in one place, being a proper apartment for the sick in the fore-hold and had one diet common to all. … two of these were ordered each a quart of cyder a day. Two others took 25 gutts of elixir vitriol three times a day. … Two others took two spoonfuls of vinegar three times a day. … Two of the worst patients were put under a course of sea water. Two others each had two oranges and one lemon given them every day. The consequence was, that the most sudden and visible good effects were perceived from the use of the oranges and lemons.

By this simple experiment, Lind managed to demonstrate that oranges and lemons were a more effective cure for scurvy than the other remedies he tried. Eventually, the Navy started giving citrus fruit to all sailors on long voyages to protect against the disease.

In another historical example, Dr. John Snow worked with the Reverend Henry Whitehead in the midst of a 19th-century cholera outbreak in London. As described by Steven Johnson in his book The Ghost Map, the two men took a truly multidisciplinary approach. Dr. Snow’s scientific understanding of how the disease was transmitted was powerfully coupled with Reverend Whitehead’s human understanding of the local community and its behaviors to help uncover the source and ultimately stop the spread of the disease.

The link between these British scientists and how we use data in design today might not seem immediately obvious, but Lind is credited with conducting not only the first ever clinical trial, but also the first controlled experiment on multiple groups, where all factors remained the same aside from a single variable. As for Dr. Snow and Reverend Whitehead, you can see how qualitative data was gathered to provide more insight into what was going on in the communities, and how that qualitative data, collected at scale, brought insight and clarity to a phenomenon that was initially confounding.

The history of the modern “data scientist” is much more recent. It was in the 1960s that statisticians like John Tukey started to think more about what it meant to bring a scientific approach to data analysis. In the 1970s, these analysts started to recognize that the power they could bring to data was to enrich it with insight and more information. Today, there is no question that “data science” is the term that covers all the work done to capture, measure, and interpret the vast amount of data that represents our users on a daily basis.

We now take it for granted that in medicine, new treatments will be fully researched and rigorously evaluated against other options before being adopted. We expect the same level of rigor in the design and engineering of safety-critical systems like aircraft, automobiles, or nuclear power stations. But in the design of consumer-facing and recreational software and websites, where human lives are rarely at stake, the pursuit of the best possible designs through a similar approach of testing various options using a scientifically structured methodology is a relatively new phenomenon. Based on the sheer volume of articles, publications, and talks we see generated about data and business, data and technology, and data and marketing, it’s clear that data is a very hot topic and the currency of the day. What we want to do with this book is to take the term “data science” and expand it even further, to move it beyond the realm of people who consider themselves statisticians and make it something that designers will start to embrace as part of their skill set as well.
