Stats ML PhD (Profile Evaluation + Recommendations)


Hey everyone! I wanted to get a better sense of my prospects for a PhD in Statistics, aimed at ML, mostly to figure out which schools I should be applying to (I have some specific questions on that at the bottom of the post). Anyway, here's an overview:

Undergrad Institution: Princeton University

Undergrad Major: Bachelor of Arts in Mathematics, Certificates in Applications of Computing (CS) and Statistics/Machine Learning

GPA (Undergrad): 3.725 department, 3.65 overall

Type of Student: Domestic

Relevant Courses (Undergraduate):

Honors Analysis (A), Honors Lin. Alg (A), Analysis II: Complex (A-), Topology (A-), Discrete Math (A), Theory of Algorithms (A), Fundamentals of Stats (A-), Optimal Learning (A), Neural Nets: Theory/Apps (A-), Analysis of Big Data (A), Junior Seminar: Analytic Number Theory (A-), Junior Paper (A), Senior Thesis + Oral defense (A, A), Real Analysis (B), Abstract Algebra (B)

Relevant Courses (Graduate):

Fairness in ML (A), Theoretical ML (B+), Machine Learning & Pattern Recognition (B+)

Relevant Research: Princeton requires doing an undergrad "junior paper" and "senior thesis", so I did mine in applied game theory and applied ML/data analysis respectively. I also did research in mathematical modelling with a professor from Columbia over freshman summer. And, although not really "research," I did some projects that extended papers in some of the grad classes I took (specifically (1) creating a probabilistic ML fairness checker for scikit-learn, (2) a policy gradient exploration for Tesla charging station locations, and (3) implementing an x86 neural branch predictor).

Recs: 3 strong rec letters (undergrad thesis advisor, Columbia professor, and current manager)

GRE General: V/Q/A: Haven't taken it yet, but assume ~165/167/5 or so

GRE Math Subject: Haven't taken (considering taking, but logistics seem a little screwy with coronavirus)

Programs Applying: Statistics PhD

Current status: Working as a computer graphics/vision engineer at Facebook

I'm mostly worried about my real analysis and abstract algebra grades, and some of the grad classes I took in ML. Of course, there's nothing I can do about the grades now (I've studied the material since graduating, understand it a lot better now, and really love it! Guess I was just missing something when I took the classes, but eh).

Anyway, with that out of the way, I would love to gauge which schools seem reasonably well matched with these stats. This is largely to find a set of schools to which I should apply. That said, I'm still quite early in my search for labs -- I'm really interested in doing research on the theory side of ML (hence the application for a stats PhD vs. CS). Would anyone happen to know labs doing interesting work? Thanks to anyone who read through this long post -- I really appreciate you taking the time and would love to hear anything you have to say!

I think you have a good shot at getting into some of the top programs. I don't think a few B's will tank your application, especially since you have A's in similar courses. It might help to have one of your recommenders speak specifically to your ability in those courses. I'll defer to more seasoned members to speak more in-depth on your chances though.

Searching for labs before you're accepted isn't as common in statistics as it may be in CS. You're accepted to the department rather than to a particular lab. I'm not intimately familiar with all of the research going on at different institutions, but several departments have a fair number of people in ML theory, and lots of departments are moving towards ML. If you have a particular topic in mind, that may help in giving recommendations.

ML is more of a buzzword. What exactly is the kind of thing you want to work on? Jianqing Fan is a top stats prof at Princeton, and I think most of what he does now would be classified as "ML". Tracy Ke, who is now a prof at Harvard, was his student. I went to her lecture a while ago and it was quite packed because it was considered ML. But she worked on high-dimensional lasso stuff, which is very different from, say, CNNs or other neural networks. To give you some perspective, I took an "ML" class with Leslie Kaelbling at MIT. She also specializes in "ML", but what she does is RNN/MDP work in robotics. This again is very different from what Jianqing Fan (as a statistician) is doing. It seems that your interests mostly lie in CS. Why aren't you applying to CS PhD programs in computer graphics? It is true that some students in stats do work on the kind of stuff you are interested in, but it is not very prevalent.

 

You have a strong background from an Ivy - one or two Bs are not going to be a dealbreaker.  I probably wouldn't bother with applying to Stanford, but I could see you possibly getting in pretty much anywhere else.  Could see you getting into some top 10s, and you'll get into a lot of good programs between 10-50.  

I'd focus on looking at program faculty and narrowing down your research interests a little more.  ML is pretty broad.  There are not a ton of people doing deep learning stuff, for instance, but you can find them at some departments.  High-dimensional statistical ML (like LASSO type stuff) is very common though, if you consider that machine learning.  Some people work on reinforcement learning for clinical trials.  UT-Austin has people doing Bayesian nonparametrics.  These are all very different things, but are all "ML."

Wow, thanks for all the replies! It seems like the vagueness in my original post makes this a difficult question to answer. So, to clarify my interests (just a quick background): I was rather torn between applications and theory in my undergrad studies, hence the split between CS courses and the more theory-focused courses, and that is also why I have worked in industry for a year and a half. But now my interests are in fields like statistical learning theory and perhaps reinforcement learning (e.g. bandit problems, policy methods). I clearly still need to spend time refining this set of topics, but off the top of my head, Peter Bartlett's work at Berkeley (https://people.eecs.berkeley.edu/~bartlett/) looks super cool to me! So any group that is doing similar work would probably be what I'm looking for.

Also, as a somewhat more blunt question, what does the "top 10" refer to here? (I realize that rankings are quite arbitrary, but it's nice to get a sense of the places that other academics consider as having interesting work going on.) There's the stats top 10 (from US News), but I wasn't sure if that really mapped to the ML subfield (as I narrowed it above). Does anyone have a sense of what the top schools (other than Berkeley, Stanford, MIT, and UW) would be? Thanks again!

The most-used rankings are the US News rankings: https://www.usnews.com/best-graduate-schools/top-science-schools/statistics-rankings
These combine statistics and biostatistics programs, so the two are intertwined in the list.  I used "top 10" as an arbitrary shorthand, referring roughly to the programs on that list from Stanford through UPenn (#12).

For reinforcement learning/bandit/policy stuff, I'd look at Harvard (Susan Murphy), NCSU (Laber), UNC Biostatistics (Kosorok), UW (Luedtke) off the top of my head.

For ML more broadly, CMU has a stat ML lab group.  You'll find people doing some type of ML stuff at pretty much any top department, though it may become rarer and less theoretical as you go down the list.  I'd take the time to look through faculty at any school that might be possibly interesting to you, as it's the only way to make sure you don't miss anything.

MIT, like Princeton, doesn't have a dedicated statistics department but has some people doing stat ML in other departments.  Stanford has a lot of people doing the LASSO/compressed sensing stuff that originated there. But stats departments in general have a broad range of research going on, and ML is done in many of them, so I don't think it's very useful to talk about ML subfield rankings, especially since it's hard to draw the line between stats and ML a lot of the time.

It looks like we have similar interests. I did some research, and to me these are some of the top schools and faculty in machine learning theory from the statistics side. MIT actually does not seem to have a ton of faculty working on the statistics side of ML theory. Other than Tamara Broderick, faculty in the MIT Statistics & Data Science Center (https://stat.mit.edu/people_categories/core/) seem to focus on topics in extremely mathematical statistics (such as optimal transport, etc.) or high-dimensional econometrics.

Stanford statistics (Tibshirani, Hastie, Duchi, Ma)

Duke (Dunson, Rudin, Parr)

CMU statistics / ML (entire list of statml theory group faculty) - seem to have a lot of junior faculty / more recent hires working in this area

Berkeley statistics (Yu, Wainwright, Jordan, Steinhardt, Bartlett)

Univ of Washington statistics / biostatistics (Shojaie, Witten, Harchaoui, Kakade) - assuming you place out of the masters level coursework

Univ of Michigan statistics (Nguyen, Regier, Tewari)

7 hours ago, bayessays said:

For reinforcement learning/bandit/policy stuff, I'd look at Harvard (Susan Murphy), NCSU (Laber), UNC Biostatistics (Kosorok), UW (Luedtke) off the top of my head.

I think Wager (Stanford) and Moodie (McGill) also work in this area.
