 Who are our superstars?

Wed Oct 28, 2015 12:51 pm

Couldn't figure out which part of the forum was appropriate.

I'm thinking about the part of the homework that asks us to read some biographies.

Specifically, I'm curious who the data science world considers its Tim Ferriss types. These are the people that you wouldn't necessarily reach out to.

So, who are the superstars in this field? What specialties do they hold?
Wed Oct 28, 2015 11:41 pm

That's a really good question, Sean. I'm excited to hear Theo's and others' answers to that.
Thu Oct 29, 2015 12:42 am

There are a lot of big names in the field; these are the ones that come to my mind. I will note that these names are really only representative of the academic track. Data science outside of academia is very different (and since I'm pretty much just out of undergrad, I don't know anyone in that area). This is necessarily a very compressed and subjective list. There are a large number of very, very good researchers that I have omitted whose career paths may be more relevant. Similarly, you should recognize that a lot of ML / data science work is domain-specific. There is a lot of good work by database people, astronomers, computational biologists, etc. that is omitted here. I have also omitted many younger researchers who are currently making big contributions to the field. Of particular note is the omission of much statistical work. There are many statisticians of note that I have not mentioned, primarily because I have tried to confine this list to ML. However, there is a lot of important statistics work on high-dimensional data: Friedman and others are responsible for a lot of the theory of the lasso and high-dimensional PCA. If people have a specific interest outside of this, please write.

Vladimir Vapnik (NYU) & Alexey Chervonenkis - Responsible for SVMs and the VC dimension. SVMs + kernels dominated machine learning in the 90s and the last decade. The VC dimension is still one of the cornerstones of advanced theory in statistical machine learning. Most of their findings are summed up in the book Statistical Learning Theory by Vapnik, but it's a REALLY dense tome and probably isn't the most enlightening source on these topics.

Michael Jordan (UC Berkeley) - Very big name in statistics, runs a huge research lab at Berkeley.

Yoshua Bengio (UMontreal), Yann LeCun (Facebook), Geoffrey Hinton (Google) - The three biggest names in deep learning. Virtually all researchers in deep learning are their students.

Andrew Ng (Stanford) - Hugely cited, director of Baidu's Silicon Valley AI lab, co-founder of Coursera. He's a big name in the field, but to be honest I can't think of anything special from him. He does seem to have a large background in deep learning because that's what he does with Baidu (they poached him from Google).

Leslie Valiant (Harvard) - Developed the probably approximately correct (PAC) learning model.

Tom Mitchell (CMU) - Has been in the field for a long time. Lots of work. In my opinion the most interesting work he's done is the NELL project: essentially, they built a web crawler that continuously scrapes the web and learns relationships (I was thinking of trying something similar with a dynamic neural network).

Daphne Koller (Stanford) - Probably best known for probabilistic graphical models (course on Coursera + 1000-page book); does a lot of work on bioinformatics.

Sebastian Thrun - Probably the closest to non-academic I can think of. Still a professor at Stanford but has done a lot of work in automation, self-driving cars, etc.

Pedro Domingos (UWashington)
Ellen Koenig

Thu Oct 29, 2015 3:38 am

Great list, Theo.

I would like to throw in a few names from the more applied / business DS side:
* DJ Patil (formerly Chief DS of LinkedIn, now Chief DS of the US)
* Hilary Mason (former Chief Scientist at, now an entrepreneur herself)
* Nate Silver (Statistician, most famous work on baseball stats and predicting US elections)
* Jeff Hammerbacher (Co-founder and Chief Scientist at Cloudera)
* Christian Rudder (Co-founder of OkCupid, blogged at OkTrends about the stats side of that site)

There are a few more in this list: "The World's 7 Most Powerful Data Scientists" - Forbes (I cannot post links for the first 7 days, so you have to Google this if you are interested).

Personally, I am also very interested in finding famous data engineers, since (apart from Wes McKinney, the creator of Pandas) I mostly know their products, but not who built them. If I find the time, I will research and post a list of those as well.
Thu Oct 29, 2015 3:36 pm

These are some great lists.

Thinking about it, I see a lot of folks I should have considered. Much like what you said, Ellen, I know the products but not the engineers.

I'll dig into this some more when I get home tonight. Thanks!
Sat Oct 31, 2015 3:07 am

If you're into logistics, there's Kevin Novak of Uber.
