Big Data for the Rest of Us

The following was first posted on Harvard Business Review.

9:31 AM Tuesday May 29, 2012
by Chris Taylor

The hype around Big Data is growing to deafening proportions, fueled by the prospect that tools now exist that can let small businesses reap the benefits that companies like Google, Amazon, and Facebook so obviously enjoy from mining vast quantities of all sorts of data.

But is that so? The answer is well, yes, kind of — though probably not as simply and easily as many vendors might like you to think. Small businesses are now testing the waters, and their early experience is already shedding light on what challenges the rest of us need to consider before taking the plunge.

One such is VinoEno, a San Francisco-based wine-recommendation start-up founded by Kevin Bersofsky. The wine industry, Bersofsky says, has been ruled by small data for the longest time. People have had very little to go on to work out whether a bottle of wine is good or not — really just the opinion of a handful of wine reviewers, whose descriptions may or may not be tacked up on the racks in the local liquor store. “You can’t get much smaller — or flawed — than one person giving a score, the Wine Spectator model, that drives an entire industry.” Such limited data, Bersofsky knew, made people very uncomfortable spending even $20 on a bottle.

Bersofsky wants to give consumers a better way to decide what wine to buy: a recommendation engine that can match people’s various tastes to the myriad attributes of various wines. He envisions marketing such a personalized wine-recommendation engine to wine retailers, who could place it in an in-store kiosk or offer it on an iPad, on a customer’s own mobile device, or as a plug-in on the retailer’s Web site.

So the VinoEno staffers have set out to build a system to collect and combine wine sensory attribute data and consumer preferences data, to determine a consumer-specific recommendation. Ultimately, they could see, the project would potentially require the collection of massive amounts of data, much iterative work to develop effective recommendation algorithms, some way to validate consumer preferences, and a lot of experimentation to develop a rewarding end-user experience. A big challenge, and they soon found out that they needed help.

  • The first challenge was to find talented people who know Big Data and analytics. Google might be able to attract armies of top-flight data analytics people for large-scale number crunching, but chances are that a small business is not going to be able to build its own analytics solution, at least not without help from a data analytics vendor. After unsuccessfully trying to hire in a very talent-constrained market, VinoEno quickly turned to an outside provider, Fabless Labs. In selecting Fabless, VinoEno was looking for an experienced partner that was not too set in its ways — one that was willing to experiment with solutions more suitable for small operations and not be unduly influenced by approaches that had worked for its deeper-pocketed clients.
  • The second challenge was to decide what tools to use. Here, of course, VinoEno’s people were clueless and depended on Fabless Labs to know which big-data collection and analysis tools would be powerful enough to handle the volume, velocity, and variety of data they’d be working with, yet simple enough for them, as non-techy business users, to employ and maintain on their own. “We needed to be able to create and use data that didn’t exist, which was both exciting and scary,” says Bersofsky. VinoEno also depended on its vendor to teach its small, non-technical staff how to handle all that data: how to implement a cloud strategy, how to move data efficiently, how many data points would make a mathematical model work, how clean the data needed to be, and how to test-market the concept.
  • The third challenge was to decide what types of information matter. What kind of information is worth the cost of collecting it? Should VinoEno be trying to match customers to various attributes of wine? Should it be trying to keep track of which groups of people buy what kinds of wine? Should it include Wine Spectator information? Even though that was the competition, could it be dismissed? In such uncharted territory, the questions simply multiplied. “To be truthful,” Bersofsky says, “we’re still working out what will ultimately solve our problem, and only trial and error will tell us. With online content, you can watch 30 movies per month and build data points quickly. With wine, the work has never been done before.”
  • The fourth challenge was to remain open-minded. As VinoEno’s staffers developed their application, they had to learn to avoid thinking they already knew the answers. As an early example, they had always believed that people’s flavor preferences would correlate with a wine’s many attributes. But the analytical results suggested that only a far smaller set of attributes matters, along with the absence of negative attributes like astringency or burning. It was hard to accept that so many traditional attributes like oakiness or fruitiness really don’t figure into people’s buying decisions at all. But if they really were going to learn something new, they simply had to let go of old ideas. After all, Bersofsky points out, “How many times has someone found a radical conclusion on the way to looking for something else? When 3M invented the Post-It Note, they were looking for something entirely different. If we stay glued to our conviction, we’ll miss other sign posts along the road.”
  • The fifth challenge was to spot the finish line. VinoEno’s founders needed to define what overall success for the recommendation app really meant. This turned out to be more of an art than a science, a combination of trial and error and gut feel for what a good recommendation would eventually be for an individual consumer. “We had no way to validate the result and no one to confirm the recommendation,” Bersofsky explains. “We decided the secret to making the answer valid was to promote the result as the best possible answer available based on the trial data, especially to the sensory scientists on the team who struggled to buy in.”

Wine consumers will be pleased to know that the first generation of the engine is now complete. With Fabless Labs’ help, VinoEno got it up in just three months and for far less than the estimated $250,000 it would have cost to hire a dedicated team. VinoEno is currently test-marketing VinSpin, the first iteration of the engine, and the proof will be in the pudding — in this case, the consumers’ perception of the value of a recommendation.

That’s one company’s story. I’d be interested to hear from others. Have you had a different experience with big data? What would be your recommendations to those just starting down this path?

When the answer is Small Data

Big Data is advertised as the secret to unlocking actionable intelligence. Collecting and sifting through vast amounts of data finds the patterns that change everything. But is elusive ‘data in combination’ the answer that we should expect from analytics? Not necessarily.

More and more often, crunching large amounts of data leads to the opposite result: The answer to many questions is found in far less data than expected. The patterns emerging from today’s large-scale analytics often produce surprises like these:

  • A clothing retailer discovers that fit matters more than color, or vice versa
  • A wine recommendation engine proves that color matters more than most other attributes, but only when a customer is an occasional wine drinker
  • Only a customer’s three most recent transactions, not their composite shopping history, reveal their preferences

Does that mean that Big Data itself is an overreaching goal for organizations? No. To understand that few factors matter, large data sets need to be created and analyzed. A Small Data answer still requires validation through data that often has velocity, volume and variety. Knowing for sure that Small Data is the answer is just as tricky, and maybe more so.
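For readers who like to see the idea in code, here is a minimal sketch of how a large data set can confirm that only one factor matters. Everything in it is an assumption made for illustration: the attribute names, the made-up rating formula, and the helper functions `pearson`, `rank_attributes`, and `make_synthetic_rows` are hypothetical and have nothing to do with VinoEno’s actual engine or data.

```python
import math
import random

# Hypothetical attribute names, chosen only for illustration.
ATTRIBUTES = ["oakiness", "fruitiness", "sweetness", "astringency"]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(rows, attributes, target):
    """Return (attribute, correlation) pairs, strongest correlation first."""
    ys = [row[target] for row in rows]
    scores = {a: pearson([row[a] for row in rows], ys) for a in attributes}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


def make_synthetic_rows(n, seed=0):
    """Build n fake tasting records with a fixed seed for repeatability."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {name: rng.random() for name in ATTRIBUTES}
        # Assumption for this demo: only astringency drives the rating
        # (negatively); every other attribute is pure noise.
        row["rating"] = 5.0 - 3.0 * row["astringency"] + rng.gauss(0, 0.3)
        rows.append(row)
    return rows


if __name__ == "__main__":
    rows = make_synthetic_rows(2000)
    for name, score in rank_attributes(rows, ATTRIBUTES, "rating"):
        print(f"{name:12s} {score:+.2f}")
```

With a couple of thousand records, the one attribute that truly matters stands out and the rest hover near zero. Run the same ranking on a handful of records and chance correlations can look just as convincing, which is why the Small Data answer still needs the large data set behind it.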

If we’re not careful, assuming complexity can blind us to the fact that simplicity is the real answer.

The key to today’s Big Data capabilities is to have an open mind and be ready for the answer that you don’t expect.

The fun of not knowing the answer

The best part of the current startup landscape is that we have no way of knowing what will and won’t work. In fact, the situation is the same for established organizations. Between social, mobile, cloud and an Internet that now reaches billions of people, there is enormous change on the horizon.

We know from recent history that seemingly crazy ideas will break through and what seems like a safe bet will go nowhere. That’s the beauty and terror of the rapid changes we’re seeing.

Given this uncertainty, how does a small startup go from ‘nowhere’ to ‘now here’ (a Love Guru reference, for non-movie-buffs)? How does an established company shift to meet a changing world?

Stay nimble

The first idea can often be just the precursor to the breakthrough. Look no further than Flickr, whose founders set out to create a way to share photos as part of gaming. What they stumbled upon with photo sharing dwarfed the original plan in both creativity and financial value. What matters most about this story is that the founders were willing to see the market for their ‘accidental product’ and change course.

Nimble companies change direction when the cues dictate.

Fail fast, fail cheaply

Getting to a great idea can require several attempts at products or services that may not work out. There are countless stories of inventors who found success on their 10,000th attempt, but that’s not the point. Get ideas out quickly and as painlessly as possible so that the good one comes to the surface sooner. The longer an idea takes to develop, the costlier and riskier it becomes. We cherish the things that have taken our biggest investment, our ‘babies’, which can easily blind us to whether that investment was a good idea or not.

While we’re on the topic: reward those who fail fast, and don’t punish the willingness to try out an idea. Otherwise, you’d be getting rid of your innovators.

Focus on the important things

What matters most is that the idea has market value and that you have the people to realize the vision. To that end, build a smart, creative team and avoid turnover. The longer you work to solve a problem together, the better you’ll get at it. The team will become experts at moving an idea from inception to market and will get faster and better each time.

Unless you’re one of the few who have unlimited funding (and therefore, time) and a first, perfectly conceived idea, your moves will need to follow these patterns to be successful.

Sure, there’s lots more advice about how to create or change your business. I would argue that this is the core of the problem…this is the hard stuff.

A great time to be a very small startup

As an independent intellectual property attorney, I get to work with the best Silicon Valley startup companies as they launch new ideas. It gives me a remarkable view into the next generation marketplace and the technology that supports it. I couldn’t imagine a better use of my skills.

I don’t think there’s ever been a better time to start a technology business. Today’s tools make everything from branding to email a simple affair and keep overhead low, in some cases close to nothing. No fancy address is required to instill investor confidence, as we’ve accepted that some of the greatest ideas can start anywhere, like in a college dorm. Working out of a residence is no stigma in today’s startup scene.

Beyond setting up shop, some of the best new tools are open source and don’t require contracts with, or royalties to, various software vendors for embedded functionality. Hadoop is one of the best examples of a hot product that doesn’t have any contracts associated with it.

Beyond extremely low overhead, it has been years since investors were so willing to fund new, unproven ideas. There is a strong sense that we’re on the verge of another technology breakout that will be looked back on as a watershed moment in technology. What a great time to be in this business.