last modified: 2017-10-08

EMLyon logo corp

 

7 roads to data-driven value creation

Not a closed list, not a recipe!

Rather, these are essential building blocks for a strategy of value creation based on data.

1. PREDICT


Prediction: The ones doing it

1. Predicting churn / default / … (banks / telcos)

2. Predicting crime (PredPol)

3. Predicting deals (Tilkee)

4. Predictive maintenance (CAT)

Prediction: the hard part

1. Collecting data (cold start problem)

2. Risk missing the long tail, algorithmic discrimination, stereotyping

3. Neglect of novelty

2. SUGGEST


Suggestion: The ones doing it

1. Amazon’s product recommendation system

2. Google’s “Related searches…”

3. Retailers’ personalized recommendations (Auchan)

Suggestion: the hard part

1. The cold start problem, managing serendipity (see review: paying version, free version not available) and "filter bubble" effects (review: paying version, free version here).

2. Finding the value proposition which goes beyond the simple “you purchased this, you’ll like that”
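
To make the baseline “you purchased this, you’ll like that” concrete, here is a minimal sketch of a co-purchase recommender. The baskets, item names and function names are hypothetical and invented for illustration; this is not how Amazon or any retailer actually does it. It also shows the cold start problem: an item that has never been bought yields no suggestion at all.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical purchase baskets, invented for illustration only.
BASKETS = [
    {"tent", "sleeping bag", "stove"},
    {"tent", "sleeping bag"},
    {"tent", "headlamp"},
    {"novel", "bookmark"},
]

def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = defaultdict(Counter)
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def recommend(item, counts, top_n=2):
    """Return up to top_n items most often bought together with `item`."""
    neighbours = counts.get(item, Counter())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]

COUNTS = build_cooccurrence(BASKETS)
print(recommend("tent", COUNTS))     # items most often co-purchased with a tent
print(recommend("hammock", COUNTS))  # never purchased: cold start, empty list
```

Anything of this kind only replays past co-purchases, which is why a value proposition that goes beyond it is the hard part.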

3. CURATE


Curation: The ones doing it

1. Clarivate Analytics curating metadata from scientific publishing

2. Nielsen and IRI curating and selling retail data

3. IMDb curating and selling movie data

Curation: the hard part

1. Slow progress: curation needs human labor to ensure high accuracy, so it does not scale the way a computerized process would.

2. Must maintain continuity: missing a single year or month hurts the value of the overall dataset disproportionately.

3. Scaling up / right incentives for the workforce: the workforce doing the curation should be paid fairly, which is not yet the case.

4. Quality control

4. ENRICH


Enrichment: The ones doing it

1. Selling methods and tools to enrich datasets (IBM Watson)

2. Selling aggregated indicators (EDF)

3. Selling credit scores

Enrichment: the hard part

1. Knowing which cocktail of data is valued by the market

2. Limiting replicability

3. Establishing legitimacy

5. RANK / MATCH / COMPARE


Ranking / matching / comparing: The ones doing it

1. Search engines ranking results (Google)

2. Yelp, Tripadvisor, etc., which rank places

3. Any system that needs to filter out best quality entities among a crowd of candidates

Ranking / matching / comparing: the hard part

1. Finding emergent, implicit attributes (a ranking based on a single public feature is neither interesting nor valuable)

2. Ensuring consistency of the ranking (many rankings are less straightforward than they appear)

3. Avoiding gaming of the system by users (for instance, companies try to game Google’s ranking of search results to their advantage)
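
One way to see why rankings are less straightforward than they appear: the same review data gives a different order depending on how the score is computed. The sketch below compares a plain mean rating with a mean shrunk toward a prior, so that places with very few reviews are not ranked on thin evidence. The places, numbers, prior values and function names are all hypothetical, and the shrunk mean is only one possible choice, not the method used by any particular site.

```python
# Hypothetical places: (number of reviews, mean rating out of 5).
PLACES = {
    "Cafe A": (2, 5.0),
    "Cafe B": (200, 4.6),
    "Cafe C": (50, 4.3),
}

def rank_by_mean(places):
    """Rank places by their raw mean rating, best first."""
    return sorted(places, key=lambda p: -places[p][1])

def rank_by_shrunk_mean(places, prior_mean=4.0, prior_weight=10):
    """Rank by a mean pulled toward prior_mean, as if each place
    started with prior_weight reviews at that rating (assumed values)."""
    def score(p):
        n, mean = places[p]
        return (n * mean + prior_weight * prior_mean) / (n + prior_weight)
    return sorted(places, key=lambda p: -score(p))

print(rank_by_mean(PLACES))         # the place with 2 perfect reviews comes first
print(rank_by_shrunk_mean(PLACES))  # the place with 200 reviews comes first
```

Neither order is "the" correct one. The choice of score is a design decision, and it also affects how easy the ranking is to game with a handful of fake reviews.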

6. SEGMENT / CLASSIFY


Segmenting / classifying: The ones doing it

1. Tools for discovery / exploratory analysis by segmentation

2. Diagnostic tools (spam or not? buy, hold or sell? healthy or not?), e.g. Medimsight

Segmenting / classifying: the hard part

1. Evaluating the quality of the comparison

2. Dealing with boundary cases

3. Choosing between a pre-determined number of segments (as in k-means) or letting the number of segments emerge
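
The trade-off around the number of segments can be sketched with a bare-bones k-means on one-dimensional data. With k-means, k is an input: the algorithm will always return exactly k segments, whether or not the data supports them. The data, the deterministic starting points and the function names below are assumptions made for illustration; a real project would more likely use a library implementation.

```python
# Hypothetical monthly spend per customer, invented for illustration.
SPEND = [10, 12, 11, 50, 52, 49, 90, 95, 92]

def kmeans_1d(values, k, iterations=20):
    """Plain k-means on 1-D data with a deterministic start."""
    data = sorted(values)
    # Deterministic initialisation: k points spread evenly over the sorted data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in data:
            # Assign each point to its nearest centroid.
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

def within_cluster_ss(centroids, clusters):
    """Total squared distance of points to their cluster centroid."""
    return sum((x - m) ** 2 for m, c in zip(centroids, clusters) for x in c)

for k in (1, 2, 3, 4):
    centroids, clusters = kmeans_1d(SPEND, k)
    print(k, clusters, round(within_cluster_ss(centroids, clusters), 1))
```

The within-cluster spread keeps shrinking as k grows, so it cannot pick k on its own. Looking for the value of k after which the improvement becomes small is one common heuristic. Methods that let the number of segments emerge avoid fixing k up front but bring their own parameters to tune.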

7. GENERATE / SYNTHESIZE (experimental!)


Generating: The ones doing it

1. Intelligent BI (Aiden)

2. wit.ai, the chatbot platform by Facebook

3. Virtual assistants cx

4. Image generation (DeepArt)

5. Close-to-real-life speech synthesis (Google)

Generating: the hard part

1. Should not create a failed product / false expectations

2. Both classic (think of Clippy) and frontier science: not sure where it’s going

Combos!

Combinations

 

The end

Find references for this lesson, and other lessons, here.

This course is made by Clement Levallois.

Discover my other courses in data / tech for business: http://www.clementlevallois.net

Or get in touch via Twitter: @seinecle
