Uflavor, a startup in Indianapolis, Indiana, has a crowdsourced process for creating a virtually unlimited number of soda flavors. Inc. Magazine just named it one of the three most innovative sites around social media.
The plan is for the customer to specify combinations of:
- Flavors (Apple Pie, Bacon, Clementine, etc.)
- Sweeteners — sugars and sugar substitutes
- Sweetness levels (1–10 in increments)
- Carbonation levels
- Acids (Citric, Phosphoric, Malic, Fumaric, Tartaric, Ascorbic)
- Extras (Taurine, Caffeine, Ginseng Extract, Electrolytes)
The goal is to create crowdsourced sodas that the community votes on in a contest. The company is also working on a soda machine that can deliver custom blends of soda.
From a data perspective, assuming there are log files of the combinations and final sodas along with unit sales numbers, how would you figure out:
- What are the most popular blends of soda? Are they converging in a few areas or not (e.g., “tropical flavors”, “sodas with ginseng”, “citrus flavors without caffeine”)? What are the natural “clusters”? How do these compare with standard sodas on the market?
- Is there a way to combine the product blend analysis with customer segmentation? For example, athletes may prefer sodas with electrolytes, while young adults may prefer specific flavor combinations (ginseng with ginger).
- How does this relate to product development and product management? Can these crowdsourced flavors be fed into a standard development process and help inform new product development?
My initial thought is that the best way to deal with the data would be to apply standard industry groupings first, since there is an installed base, and then to slice the data in various ways to find hidden insights into the combinations.
It would look something like:
- Caffeinated vs. non-caffeinated drinks
- Cola vs. non-cola (fruit carbonates)
- Diet vs. non-diet drinks
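As a rough illustration of applying those groupings, here is a minimal Python sketch that tags each blend with the three standard categories and totals unit sales per combination. Everything in it is an assumption made for the example: the record layout, the field names, the sample rows, the "Cola" flavor, and the list of sugar substitutes are all hypothetical, since the actual log format is not known.

```python
from collections import defaultdict

# Hypothetical blend records; field names and values are illustrative only.
blends = [
    {"flavors": {"Cola", "Cherry"}, "sweetener": "sugar",
     "extras": {"Caffeine"}, "units": 120},
    {"flavors": {"Clementine"}, "sweetener": "sucralose",
     "extras": set(), "units": 80},
    {"flavors": {"Cola"}, "sweetener": "aspartame",
     "extras": {"Caffeine"}, "units": 50},
    {"flavors": {"Apple Pie"}, "sweetener": "sugar",
     "extras": {"Ginseng Extract"}, "units": 30},
]

# Assumed list of sweeteners that would make a blend count as "diet".
SUGAR_SUBSTITUTES = {"sucralose", "aspartame", "stevia"}


def categorize(blend):
    """Map one blend onto the three standard industry groupings."""
    return (
        "caffeinated" if "Caffeine" in blend["extras"] else "non-caffeinated",
        "cola" if "Cola" in blend["flavors"] else "non-cola",
        "diet" if blend["sweetener"] in SUGAR_SUBSTITUTES else "non-diet",
    )


def units_by_category(records):
    """Sum unit sales for each combination of the three groupings."""
    totals = defaultdict(int)
    for blend in records:
        totals[categorize(blend)] += blend["units"]
    return dict(totals)


category_totals = units_by_category(blends)
for category, units in sorted(category_totals.items()):
    print(category, units)
```

Keying the totals on the full tuple keeps every cross-combination (for example caffeinated, cola, diet) visible rather than reporting each grouping on its own.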
The trick would be to look at multiple combinations of these categories and then combine them with customer segmentation data. An alternative would be to iterate on clusters to see which combinations occur most often and try to derive insights from there.
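One simple way to start on the "which combinations occur most often" idea is to count ingredient pairs, weighted by units sold, optionally within a single customer segment. The sketch below is plain co-occurrence counting rather than a real clustering algorithm, and the order records, segment labels, and field names are hypothetical placeholders for whatever the logs actually contain.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order records; segments and ingredients are illustrative only.
orders = [
    {"segment": "athlete",
     "ingredients": {"Clementine", "Electrolytes", "Citric"}, "units": 40},
    {"segment": "athlete",
     "ingredients": {"Clementine", "Electrolytes"}, "units": 25},
    {"segment": "young adult",
     "ingredients": {"Ginger", "Ginseng Extract"}, "units": 60},
    {"segment": "young adult",
     "ingredients": {"Ginger", "Ginseng Extract", "Caffeine"}, "units": 10},
]


def top_pairs(records, segment=None, n=3):
    """Return the n ingredient pairs with the most units sold.

    If segment is given, only orders from that customer segment count.
    """
    counts = Counter()
    for order in records:
        if segment is not None and order["segment"] != segment:
            continue
        # Sort so each pair has one canonical (alphabetical) ordering.
        for pair in combinations(sorted(order["ingredients"]), 2):
            counts[pair] += order["units"]
    return counts.most_common(n)


print(top_pairs(orders, n=1))
print(top_pairs(orders, segment="athlete", n=1))
```

Pairs that dominate one segment but not the overall list are candidates for segment-specific products; a proper clustering method could then be tried on the same data to see whether the groupings hold up.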