Data science is like Jiu Jitsu. The comparison seems odd, but might not be as ridiculous as it sounds at first.
Jocko Willink is someone I follow who I think really gets it. On his podcast he often compares business and leadership issues to Jiu Jitsu. You can do the same for data science, and I’ve noticed three big parallels:
Data science requires practice
While talent is a factor, data science success is mostly about practice. Just as a brand new white belt in the first month of training will be dominated by white belts with six months of experience, brand new analytics professionals will need help from those with just a little more experience to navigate early hurdles.
There are common errors you will see time and again that proper experience allows you to navigate around. It is not just code; things like experiment design pitfalls and implementation difficulties are understood through experience.
You cannot jump right to a solution
In Jiu Jitsu you cannot go directly for a submission. If you try to armbar someone without getting them to expose their arm, it will not work.
Data science is similar; you need to prepare your data and frame the problem to be modeled. Your experiment design is like the setup moves; without it your model will fail or get you into trouble.
Simplicity rules in the real world
Oh no, my neural network identified Yoda as a guy on a skateboard? What do you think, Jocko?
Good. Modeling is hard. This gives you a chance to expand your training data selection.
In both data science and Jiu Jitsu, the simplest method is typically the best. Complicated submissions like Eddie Bravo’s twister can work, but the vast majority of successful submissions are simple ones like armbars and rear naked chokes.
This same maxim applies to problem-solving algorithms. Cool solutions like neural networks with tons of layers are interesting and powerful. However, the overwhelming majority of models in production are simple ones like linear regression. In practice you should make sure any added complexity is necessary and that its gains over a simple regression are real, not just the result of overfitting.
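Here is a rough sketch of what that check can look like, using only the Python standard library. Everything in it is an assumption made for illustration: the data is synthetic (a straight line plus noise), the "fancy" model is a 1-nearest-neighbor lookup standing in for any overly flexible model, and names like `fit_line` and `fit_nearest_neighbor` are made up for this post. The idea is just to score both models on data they did not train on before trusting the more complicated one.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (the simple baseline)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbor(xs, ys):
    """1-nearest-neighbor: memorizes the training set (the flexible model)."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, rng):
    """Synthetic data (an assumption): a linear signal plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 2.0) for x in xs]
    return xs, ys


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(200, rng)  # held-out data the models never see

simple = fit_line(train_x, train_y)
flexible = fit_nearest_neighbor(train_x, train_y)

simple_train = mse(simple, train_x, train_y)
simple_test = mse(simple, test_x, test_y)
flexible_train = mse(flexible, train_x, train_y)
flexible_test = mse(flexible, test_x, test_y)

print(f"linear regression: train MSE {simple_train:.2f}, test MSE {simple_test:.2f}")
print(f"nearest neighbor:  train MSE {flexible_train:.2f}, test MSE {flexible_test:.2f}")
```

The nearest-neighbor model looks perfect on the training data because it has memorized it, but on the held-out data it does worse than the plain regression. A big gap between training and held-out error is the warning sign; if the fancy model cannot beat the simple one on held-out data, stick with the simple one.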