00:00:08 The hard truth about data science projects failing to impact daily business operations.
00:00:55 The general understanding of data science in most companies as glorified statistical modeling.
00:02:17 The difference between classical statistical modeling and what companies like Google, Microsoft and Amazon are doing.
00:03:51 The history of electromagnetism and the comparison to data science.
00:07:04 The problem with the Kaggle mindset in data science.
00:08:01 Discussion on how technology innovation works in the real world.
00:09:01 Example of how Amazon revolutionized e-commerce.
00:10:01 Explanation that the majority of data science departments are stuck in academic ways.
00:12:34 Explanation that the biggest risk for companies is becoming obsolete.
00:14:06 Discussion on the future importance of AI and deep learning in companies.
00:16:01 The progress of autonomous vehicles is stunning but the question is if it can be industrialized.
00:16:17 The discussion shifts to the application of technology in supply chains and finance.
00:17:37 A litmus test to determine if a company’s data science team is significant enough to the company’s survival.
00:20:07 The importance of executives having “mechanical sympathy” in order to effectively use data science in their company.
00:22:34 The idea of hiring young people fresh out of college to revolutionize a company is wishful thinking and intellectually lazy.
In the interview with Kieran Chandler, Joannes Vermorel discusses how most companies don’t harness the full potential of data science, focusing on existing statistical models rather than developing new ones. He believes companies should reinvent processes and adopt innovative statistical methods. Vermorel highlights the importance of embracing technical capabilities and incorporating them at the executive level through “mechanical sympathy.” He argues that hiring young engineers is insufficient for company transformation and that executives must have mechanical sympathy to truly harness new technology. Successful entrepreneurs tend to be either very young or experienced, with that experience often gained through founding new companies. Relying on fresh graduates for company transformation is wishful thinking and intellectual laziness.
In this interview, Kieran Chandler speaks with Joannes Vermorel, the founder of Lokad, about the shortcomings of classical statistical modeling in data science and how companies need to rethink their approach to succeed in this field.
Vermorel explains that most companies view data science as glorified statistical modeling, applying existing statistical models to data. However, companies that are truly serious about data science, such as Google, Microsoft, and Amazon, are rethinking the nature of statistical methods and inventing new ones, rather than merely applying existing models.
He argues that the way data science is currently envisioned in most companies is too simplistic and compares it to the introduction of electromagnetism in the 19th century. At that time, people only saw it as a useful tool for specific tasks, like quality assurance. However, once electricity was harnessed, it transformed entire industries. In the same way, Vermorel believes that data science has the potential to revolutionize businesses, but only if companies approach it differently.
He critiques the common approach to data science, which involves collecting and cleaning data, applying statistical models, and providing results. He argues that this method, often represented by the Kaggle community, is overly simplistic and focuses too much on finding the best model for a given dataset and problem, rather than truly understanding the underlying data and its potential applications.
Instead, Vermorel suggests that companies should look at data science as a way to reinvent their processes, similar to how electricity revolutionized industry. This requires a shift in mindset from simply applying existing models to data to actually developing new statistical methods and approaches that can unlock the full potential of data science.
Joannes Vermorel argues that most companies are not harnessing the full potential of data science, as they focus on applying existing statistical models rather than developing new ones. To truly benefit from data science, companies must rethink their approach and focus on reinventing their processes and developing innovative statistical methods.
Vermorel argues that the adoption of new technology is not a linear process where companies simply learn from universities and implement innovations. Instead, businesses must engage in an ongoing discussion with new technologies, gaining insights and projecting their future needs based on the capabilities these technologies offer.
Vermorel uses the example of Amazon’s development of e-commerce, highlighting how the company had to rethink the future of commerce, establish requirements, and come up with innovative solutions. He emphasizes that the key to successful innovation is to clearly understand the problem and ask the right questions.
When asked why many data science departments still follow academic approaches, Vermorel cites laziness as a primary factor. Companies often opt for a “buzzwordish” approach, investing in the latest trends without considering how these technologies might fundamentally change their business models. Vermorel suggests that businesses should focus on understanding the profound changes that new technology can bring to their organizations, rather than seeking cosmetic improvements.
Chandler questions whether the risk of adopting new technologies and changing business models might be the reason for resistance to change. Vermorel acknowledges the inherent risk, but also points out the risk of obsolescence for companies that fail to innovate. He shares his experiences from a decade ago, warning retailers about Amazon’s disruptive potential. Despite initially dismissing Amazon as insignificant, many of these retailers have now found themselves struggling to compete with the e-commerce giant.
According to Vermorel, Amazon and Alibaba’s continued growth despite the diseconomies of scale they face indicates that they are light years ahead of their competition. While executives might be tempted to maintain the status quo to minimize risk, Vermorel cautions that doing so continuously can lead to a company’s downfall. Instead, businesses should actively engage with new technologies and adapt their models to remain competitive in the face of disruption.
They discuss the role of advanced statistical techniques, the future of supply chain management, and how companies can instill a data-driven culture in their teams.
Vermorel first addresses the mindset of companies that dismiss the importance of advanced statistical techniques. He argues that companies should not overlook the risks of not investing in new technologies, using the example of businesses that did not adapt to the internet in the early ’90s. He emphasizes that companies must take the risk of adopting new technologies or risk becoming irrelevant.
When asked about the future relevance of AI and deep learning, Vermorel points to the stunning achievements of companies like Google, Amazon, and Microsoft in fields such as chess and autonomous vehicles. He believes that these technologies will continue to advance and play an essential role in supply chain management. He draws a parallel between supply chain optimization and quantitative trading in finance, where the latter has already been in place for decades.
To instill a culture of innovation and risk-taking in data science teams, Vermorel suggests a litmus test: if a company were to fire all its data scientists overnight, would the company be in mortal danger or face bankruptcy within a year? If the answer is no, then the company is likely not taking enough risks with its data science team. He likens this to the early days of the internet, where companies took significant risks with web developers, despite the technology being seemingly inferior to traditional methods at the time. This risk-taking allowed those companies to adapt and thrive in the digital age.
They discuss the importance of companies embracing technical capabilities and having a strategy that incorporates these aspects at the executive level. Vermorel emphasizes the concept of “mechanical sympathy,” where executives have a deep understanding of the technical elements, enabling them to make educated decisions in collaboration with engineers.
Vermorel argues that hiring young, brilliant engineers is not enough to revolutionize a company. Rather, it is crucial for executives to have mechanical sympathy to truly harness the potential of new technology. He highlights the fact that the common pattern for successful entrepreneurs is either being very young or having significant experience and wisdom, often gained through establishing new companies. Vermorel concludes that relying on fresh college graduates to transform a company is wishful thinking and intellectual laziness.
Kieran Chandler: Today on LokadTV, we’re going to look past classical statistical modeling and discuss why data science departments must do more than just collect data, manipulate it, and provide results. So Joannes, most of us are still kind of getting to grips with classical statistical modeling. What’s the key idea today?
Joannes Vermorel: There are several ideas. First, I’m using the term “statistical modeling” as I suggested for this episode because, when you look at the data science practices as they are envisioned in most companies, it’s just glorified statistical modeling. For the general audience, in case you’re wondering, when it comes to extracting patterns or replicating some aspect of human intelligence, all we have right now is statistics. These statistical methods can have fancy names like deep learning, and some people will refer to them as AI, but literally, what we have is statistical models.
A few decades ago, there were attempts to do AI with non-statistical approaches, like the symbolic approach, which yielded almost zero practical results. This branch died out, leaving the statistical approach as the only one that still exists to any meaningful degree at the present time. So all we have to do anything fancy with data is statistical methods.
The interesting thing is, what are those teams doing in most companies that are actually doing data science? Well, they are playing with statistical models. I’m opposing that to what, let’s say, people who are really serious about it, like Google, Microsoft, and Amazon, are doing. They are not just doing statistical modeling; they are rethinking the core nature of the next statistical method, like inventing deep learning versus playing with deep learning. It’s about inventing the next stage of gradient boosted trees as opposed to taking gradient boosted trees, one statistical model, and just applying it to another set of data.
Kieran Chandler: So the common understanding is that these teams are going in, they’re collecting data, they’re cleaning it, they’re applying their statistical models, and they’re providing results. What is it that the Googles and Amazons are doing that’s so much better than that?
Joannes Vermorel: I think to answer this question, we have to go back in time, to the 19th century, when electromagnetism was this brand new thing. If you read a bit of history about how people were approaching it, you would say, “Oh, there’s electromagnetism. It’s so incredibly interesting.” Imagine you’re an industrial company of the time, a pre-industrial company. You have a factory that is semi-manual, and then you have this electromagnetism thing that is becoming of interest, and you say, “Oh, I think it’s really worth having a small team doing some fancy things with that. Maybe, for quality assurance, we will have a few things where we can test conductivity, and that’s very promising. We will do a few things that are of interest because, yes, testing conductivity is a great way to do quality assurance on a few things that we produce.”
If you think about that a century later, you would say this is nonsense. With electricity, you can have engines, electrical lighting, heating, cooling, and you can even melt metal. It will replace every need for an open flame in your factory. So, once you have electricity at scale, you can literally reinvent pretty much everything in the way you operate. The thing is, the way data science is envisioned, it’s a fancy gadget, and people stick to the frame in which the problem presents itself.
The way it’s done in academia is to present the problem as a well-defined dataset, a series of models, and accuracy metrics of some kind, because you want to do a prediction of some kind. Then you explore the various models and look for the one that performs best. That’s literally embodied in the Kaggle competitions that we discussed previously.
People say, “Well, everything has been framed; there’s a given dataset, a problem with a given metric, and then there’s an endless collection of models.” Most of them can even be composed in many ways, and you can potentially re-engineer the features that you have to make the dataset nicer with respect to the model that you have. That’s data science.
Kieran Chandler: What is so wrong about that kind of Kaggle mindset? Is it just overly simplistic, and it’s just telling you that you’re just going to have one result that’s going to be accurate, or why is it so wrong?
Joannes Vermorel: If you look at the history, and again, it’s something that you only see once it is in the past, it becomes obvious. But at the present time, when you’re at the fringe, it’s difficult to see. Once it has become the norm, it’s so obvious. So now, if I go back to data science, I see companies invariably taking on pet projects, a pet application, a sort of fancy numerical gadget where you’ve plugged in a bit of AI. Probably the most absurdly useless of these are the chatbots. But I know that there are a lot of people who have raised a lot of money to develop great chatbots that can dynamically learn how to converse on Twitter and become Nazis in 48 hours just because people fed them the right inputs. And that’s just nonsense.
Back to that, the problem is that when you want to think about what a new technology entails, it’s not a one-way process. People think that universities discover things, they teach them to students, and students who are now graduates with engineering degrees, computer science degrees, etc., go into companies and roll out the innovation. But it’s absolutely not the way the real world works.
In the real world, for example, Amazon pretty much invented what was about to become e-commerce, along with a few others like eBay and other pioneers. They defined that through their innovation. It wasn’t Jeff Bezos going somewhere in the early ’90s saying, “I’m just going to hire webmasters and build a website.” They literally had to think about what the future of commerce at a distance would be, and that was something profoundly different from what had been done so far.
So, data science, you see, the problem is that you think it’s about having a framework where you can have smart engineers that roll out the solution. Well, actually, the process is much more of an ongoing discussion where the business gains insight into some new technical capabilities and then projects themselves into what should be the business of their future. They then establish the requirements of what is missing tech-wise to actually implement that, and usually, you realize that most of the innovation comes from the requirements. Once you have a clear view of what you need, the solution is not that hard to come up with once you know which question to ask.
Kieran Chandler: And why would you say that the majority of data science departments are still kind of stuck in those academic ways? I mean, why are not more of them like the Amazons of the world?
Joannes Vermorel: First, because if you’re a company of a certain size and you see these buzzwords, the lazy way to go about it is to add a line in your budget where you’re going to spend a few million a year on the buzzwordish approach of the day. If it’s a data science team, yes, let’s do that. If blockchain is a thing, yeah, let’s have a blockchain team as well. Companies will routinely phase in something that is the buzzword of the day and phase out something that has gone out of hype. It’s just business as usual.
My message is that if you want to do anything that has any substance, you really need to ask yourself, what would be the profound change in your company due to those fancy numerical methods? If the only change is a cosmetic thing, such as auto-computing the ABC classes in a better way, it’s not going to change anything for your company. But surely, changing the entire business model of a company to adapt to some kind of new technology is a different story.
Kieran Chandler: Is it inherently risky, and would you not say that’s the reason there’s so much resistance to that sort of change? Is it because fundamentally, if people don’t understand it, there’s a lot of risks associated with it because it’s a new technology? So it’s much safer to just leave it as it is and invest maybe in a little bit of research and development?
Joannes Vermorel: The problem is the ambient risk of going out of business just because you’ve become obsolete. For most companies, this risk is quite real. I’ve been discussing with many retailers for over a decade, and literally, ten years ago, I had surreal discussions where I was telling them that this company, Amazon, is coming, and they will eat your lunch. You need to do something. And people were telling me, “Oh, but look, it’s so small. Yes, it’s growing, but it’s not even one percent market share. We don’t care.”
Nowadays, if you look at Amazon and Alibaba, they’re absolutely gigantic and still growing, which is insane considering the amount of diseconomies of scale that they’re suffering. When you grow past a certain size, you don’t get economies of scale anymore, you get diseconomies of scale. Amazon is way past the point where diseconomies of scale set in, so they have massive handicaps in anything that they do. If they want even one percent of growth, they have to overcome absolutely massive diseconomies of scale. That means they are not doing something that is marginally better than most other companies; they are just light years ahead.
So, indeed, as an executive, not doing anything or never challenging the status quo is the safest card you can play, no question about that. The problem is that if you play this card continuously for a couple of decades, well, the company just gets left behind. My message for the CEO would be: Can you tolerate that this mindset is prevalent in your company? Don’t you see any problem looming ahead? If you don’t, well, I would say nobody will shed tears over the fact that you’ve gone the way of the dodo.
Kieran Chandler: So, these advanced statistical techniques are very much the flavor of the month at the moment, and people are always talking about buzzwords like AI and deep learning. But how much confidence can you have that in ten years’ time, this will still be a topic of importance to these companies?
Joannes Vermorel: That’s a good question. First, the achievements of companies like Google, Amazon, or Microsoft with these technologies are just stunning. For example, we went from programs that had a really hard time outperforming a chess grandmaster to something that is now beating the chess champion. If you look at the latest work of Google, you can go from zero to impossibly, inhumanly good in four hours for chess. They can set up a program that will learn to play chess and, in four hours, reach a point that is literally inhuman, where it will beat every single human alive. The way the computer plays the game is just incomprehensible because it doesn’t even remotely make sense. So, you have stunning achievements, obviously, for very narrow, well-defined problems. But if you look at autonomous vehicles, it’s absolutely stunning as well; it does work.
The question is, can it be industrialized? We still have some hesitations, but the amount of progress is literally stunning.
So you can take this leap of faith and say that these technologies are proven to work on a long series of cases. It’s a very reasonable belief that the odds are pretty high that these things will completely reinvent the way supply chains are done. By the way, in the case of supply chain, it’s not even going to be very new. At Lokad, we advocate this vision of the quantitative supply chain, but if you look at what is being done in banks, in finance nowadays, it’s all about quantitative trading. They have quants, and it’s not even the future, it’s already there, and it has been there for two decades or more. So this approach is coming two or three decades late to supply chain, but it has already been in place for decades in finance.
Kieran Chandler: If someone’s watching this and they’re wondering what they need to do to instill that kind of culture in their data science teams, what are the steps they should take? How should they push their teams to take more risk and be a bit more groundbreaking and move against the easy status quo options?
Joannes Vermorel: I think you can have a simple litmus test: if the company were to fire all the data scientists overnight, would the company be in desperate trouble? Would the company go bankrupt in one year? And if the answer is no, then probably those people are inconsequential, and whatever you’re doing with those people doesn’t matter. You would say, “Oh, but it’s so much risk, it’s so brutal.” But again, reconsider the situation of Amazon in the early ’90s with the web. People would say, “Oh, but we have such dependency on these web engineers. It’s insane to be dependent on this technology that looks super crappy. You have this modem that is super slow, it takes three minutes to establish the connection. This is a pile of crap on top of a pile of crap. And then you have images that have a resolution where it’s so bad. I mean, first, you wait one minute to have the image of your product, and then it’s super crappy. This paper catalog is so much better. Real-time access to all the products, high-resolution pictures. This web thing is just a big pile of nonsense. Why would we take any risk on having a hard dependency company-wide on these web developers?” Well, the answer is because if you don’t, ten years down the road, 20 years down the road, you don’t exist anymore.
So obviously, I’m playing with hindsight. Things are way more obvious now. It’s hard for me to convey that, but I think many companies have realized it, and that’s why they have hired these data scientists. What they don’t realize is that the real question is, do you have a strategy where the core executives really play those technical capabilities to their best? That requires some strategic thinking. So, do you have mechanical sympathy, which would allow you to have an intelligent, educated discussion with the engineers that are building the engine for your company?
Kieran Chandler: Would you say that’s one of the real big blockers when it comes to data science in companies, the fact that executives probably currently don’t have the level of mechanical sympathy that they need?
Joannes Vermorel: Frankly, if you could reinvent companies by hiring 24-year-olds and letting them do the magic, it would be fantastic. But if you look at the history of companies, the number of times things happened that way, with one brilliant engineer being hired and completely revolutionizing a 50-year-old company from the inside, is very small. The dominant pattern among successful entrepreneurs is either super young people or those aged 45 plus, as the latter usually have some capital, experience, and maybe some degree of wisdom. And it usually takes a new company to get that.
My advice would be: if you think that hiring people fresh from college is going to revolutionize your company, you’re being wishful. It’s not serious and, I would say, intellectually lazy at best.
Kieran Chandler: Okay, we’ll have to wrap it up there, but thanks for your time. So that’s everything for this week. Thanks very much for tuning in, and we’ll see you again in the next episode. Thanks for watching.