Crowdsourcing Soda Flavors + Big Data

A startup in Indianapolis, Indiana, called Uflavor has a crowdsourced process for creating a practically unlimited number of soda flavors. Inc. Magazine just named them one of the three most innovative sites around social media.

This presents an inter­est­ing prob­lem in data analy­sis: how to get insights or trends from the large vol­ume of data.

The plan is for the customer to specify combinations of:

  • Flavors (Apple Pie, Bacon, Clementine, etc.)
  • Sweet­en­ers — sug­ars and sugar substitutes
  • Sweetness levels (1–10, in increments)
  • Car­bon­a­tion levels
  • Acids (Citric, Phosphoric, Malic, Fumaric, Tartaric, Ascorbic)
  • Extras (Tau­rine, Caf­feine, Gin­seng Extract, Electrolytes)
  • Color
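
One way to picture a single customer-specified blend as a log record is sketched below. The class name `SodaRecipe`, its field names, and the default values are assumptions made for illustration; nothing here reflects Uflavor's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SodaRecipe:
    """Hypothetical record for one user-created soda (not Uflavor's real schema)."""

    flavors: tuple[str, ...]
    sweetener: str
    sweetness: int              # 1 to 10, matching the scale in the list above
    carbonation: str            # assumed to be a label such as "low" or "high"
    acids: tuple[str, ...] = ()
    extras: tuple[str, ...] = ()
    color: str = "clear"        # assumed default

    def __post_init__(self):
        # Reject values outside the 1-10 sweetness scale.
        if not 1 <= self.sweetness <= 10:
            raise ValueError("sweetness must be between 1 and 10")

    def ingredients(self) -> frozenset[str]:
        """All named components, handy later for counting combinations."""
        return (
            frozenset(self.flavors)
            | {self.sweetener}
            | frozenset(self.acids)
            | frozenset(self.extras)
        )
```

Flattening each recipe into a set of ingredients keeps the later analysis simple, at the cost of ignoring the sweetness and carbonation levels.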

The goal is to create crowdsourced sodas that the community votes on in a contest. The company is also working on a soda machine that can deliver custom blends of soda.

From a data per­spec­tive, assum­ing there are log files of the com­bi­na­tions and final sodas along with unit sales num­bers, how would you fig­ure out:

  • What are the most popular blends of soda? Are they converging in a few areas or not (e.g., “tropical flavors”, “sodas with ginseng”, “citrus flavors without caffeine”)? What are the natural “clusters”? How do these compare with standard sodas on the market?
  • Is there a way to combine the product blend analysis with customer segmentation? For example, athletes may prefer sodas with electrolytes, while young adults may prefer specific flavor combinations (ginseng with ginger).
  • How does this relate to product development / product management? Can these crowdsourced flavors be fed into a standard development process and help inform new product development?

My initial thought is that the best way to deal with the data would be to apply standard industry groupings first, since there is an installed base, and then to slice the data in various other ways to find hidden insights into the combinations.

It would look some­thing like:

Caffeinated vs. non-caffeinated drinks

Cola vs. non-cola (fruit carbonates)

Diet vs. non-diet drinks
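
As a rough sketch of this first pass, the snippet below tags each logged soda with the three groupings and totals unit sales per combination. The log rows are made-up data, and the `DIET_SWEETENERS` set, the "Cola" flavor name, and the function names are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical log rows: (flavors, sweetener, extras, units_sold). Made-up data.
LOG = [
    (("Cola",), "Cane sugar", ("Caffeine",), 120),
    (("Clementine",), "Stevia", (), 80),
    (("Cola",), "Aspartame", ("Caffeine",), 60),
    (("Apple Pie",), "Cane sugar", ("Ginseng Extract",), 40),
]

# Assumption: which sweeteners count as "diet" is a labeling choice.
DIET_SWEETENERS = {"Stevia", "Aspartame"}


def categorize(flavors, sweetener, extras):
    """Map one recipe onto the three standard industry groupings."""
    return (
        "caffeinated" if "Caffeine" in extras else "non-caffeinated",
        "cola" if "Cola" in flavors else "non-cola",
        "diet" if sweetener in DIET_SWEETENERS else "non-diet",
    )


def units_by_category(log):
    """Sum unit sales for each combination of the three groupings."""
    totals = Counter()
    for flavors, sweetener, extras, units in log:
        totals[categorize(flavors, sweetener, extras)] += units
    return totals


if __name__ == "__main__":
    # Print the combinations from best-selling to worst-selling.
    for combo, units in units_by_category(LOG).most_common():
        print(combo, units)
```

Keying the totals on the full tuple rather than on each grouping separately makes it possible to see, for example, whether caffeinated diet colas behave differently from caffeinated colas overall.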

The trick would be to look at multiple combinations of these categories and then combine them with customer segmentation data. An alternative would be to iterate on clusters to see which combinations occur most often and try to derive insights from there.
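
A very simple stand-in for the "which combinations occur most often" idea is to count how often pairs of ingredients appear together. This is plain co-occurrence counting, not a real clustering algorithm, and the sample sodas and function names below are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Made-up example sodas, each reduced to a set of ingredients.
SODAS = [
    {"Ginseng Extract", "Ginger", "Citric"},
    {"Ginseng Extract", "Ginger", "Caffeine"},
    {"Clementine", "Citric"},
    {"Clementine", "Citric", "Electrolytes"},
]


def pair_counts(sodas):
    """Count how many sodas contain each unordered pair of ingredients."""
    counts = Counter()
    for ingredients in sodas:
        # Sorting gives every pair a single canonical ordering.
        for pair in combinations(sorted(ingredients), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(sodas, min_count=2):
    """Keep only the pairs that appear in at least min_count sodas."""
    return {pair: n for pair, n in pair_counts(sodas).items() if n >= min_count}


if __name__ == "__main__":
    for pair, n in sorted(frequent_pairs(SODAS).items()):
        print(pair, n)
```

Pairs that keep showing up together are candidates for the natural clusters mentioned earlier; weighting each soda by votes or unit sales instead of counting it once would be a natural next step.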

