
Posted (edited)

There've been a few threads recently that have touched on this, but I'm interested in asking directly: how much does Ph.D.-level stats coursework differ from school to school, and what do you think should be taught to first- and second-year PhD students? In a recent discussion, someone said they believe most first-year sequences are outdated and don't teach the topics needed for modern statistical research; do you agree? As I understand it, most schools have a probability sequence (typically measure-theoretic) and some sort of theoretical statistics sequence (Casella & Berger gets brought up a lot here). Are there other staples? Do you feel these are all important to becoming a research statistician? What coursework are many programs missing?

On a practical note, I'm going to have some room to take courses that could overlap with a Ph.D. program, and I'm looking for things that will help me later on but might not be present in every curriculum. One example people have said would be good to take (although it may not be in my own curriculum) is convex optimization.

Edited by discreature
Posted (edited)

The most "typical" required coursework seems to be:

  • 2 semesters of Casella & Berger mathematical statistics 
  • 2 semesters of applied statistics (based on the book "Applied Linear Statistical Models" by Kutner et al.)
  • 1 semester of statistical computing
  • 1 or 2 semesters of measure theoretic probability
  • 1 semester of linear models theory
  • 1 or 2 semesters of advanced statistical inference

Some elite PhD programs like Stanford and UPenn Wharton skip the first two sequences above because the students they admit are fairly advanced already. 

Anyway: my opinion is that the typical first-year courses are fine for the most part, though they certainly should be updated to incorporate current research topics. If an entering student has not already had much exposure to statistics at the graduate level, then I think it's fine to teach topics like linear regression, ANOVA, GLM/categorical data analysis, and the theory of sufficient statistics, point estimation, hypothesis testing, etc. in detail. That said, some of the curricula do need updating. For example, at my PhD program, an entire semester was devoted to different ANOVA/ANCOVA models, including things like split-plot designs. That seemed a bit excessive to me -- usually, you only need to go over a couple of ANOVA models in detail to get the general gist. So if I were on the PhD curriculum committee, I would probably "modernize" the applied stats sequence (and the statistical computing class) to spend less time on design of experiments and include more modern topics.
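As a quick illustration of why a couple of ANOVA models is usually enough: one-way ANOVA is just a linear model with a categorical predictor, so once students have seen regression with dummy coding, the ANOVA table comes along for free. A minimal sketch in Python/statsmodels (the simulated three-group data frame here is purely hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# simulate three groups with different means
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": np.repeat(["A", "B", "C"], 30),
    "y": np.concatenate([rng.normal(m, 1.0, 30) for m in (0.0, 0.5, 1.0)]),
})

# one-way ANOVA fit as an ordinary linear model with dummy-coded groups
fit = ols("y ~ C(group)", data=df).fit()
print(sm.stats.anova_lm(fit, typ=2))   # the usual ANOVA F-table
print(fit.params)                      # same model, regression-coefficient view
```

The F-test in that table is exactly the test that the group dummies add nothing to the regression, which is the main takeaway I'd want students to get before moving on.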

Additionally, the advanced statistical inference courses (i.e., the theoretical statistics course(s) you take in the second or third year) at many programs do seem to focus on some topics that are dated. For example, at some schools, you learn to dot every "i" and cross every "t" for "classical" topics like UMP tests, UMVUE, equivariance, the likelihood principle, etc., which isn't necessarily helpful for modern statistics research.

I would probably repurpose the advanced statistical inference classes to cover more 'modern' statistical theory like multiple testing/knockoffs, RKHS and nonparametric regression, convex/nonconvex optimization for high-dimensional regression, graphical models, etc.
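For the multiple testing piece, something like the Benjamini-Hochberg procedure is the kind of thing I'd want covered early on. Here's a bare-bones sketch in Python (the function and the simulated p-values are just mine for illustration, not from any particular syllabus):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.10):
    """Return a boolean mask of rejected hypotheses at FDR level q
    using the Benjamini-Hochberg step-up procedure."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)                  # sort p-values ascending
    ranked = pvals[order]
    thresholds = q * np.arange(1, m + 1) / m   # BH critical values i*q/m
    below = np.nonzero(ranked <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size > 0:
        k = below.max()                        # largest rank passing its threshold
        reject[order[: k + 1]] = True          # reject everything up to that rank
    return reject

# toy example: 90 null p-values plus 10 strong "signals"
rng = np.random.default_rng(0)
p = np.concatenate([rng.uniform(size=90), rng.uniform(0, 0.001, size=10)])
print(benjamini_hochberg(p, q=0.10).sum(), "rejections")
```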

Edited by Stat Assistant Professor
Posted

I agree with @Stat Assistant Professor. Students have been pushing to modernize curricula, but it's difficult because professors are always concerned about prestige or rigor. Some topics I think should always be covered are:

  • Bayesian statistics (becoming more and more used in practice, even being picked up by CS people)
  • Computation / simulation (preferably in C++ / Python and on Unix servers)
  • Machine learning / nonparametric statistics (may be a buzzword, but it gets you jobs)
  • Missing data (very common in practice)

Some topics I think can be tossed out, that are typically required:

  • Measure theory (useful for many people, but not for all)
  • Decision theory (hardly ever used in practice)
  • Anything concerned with unbiased estimation (UMVUE, etc.--most practical estimators are biased so who cares)

I do think UMP and UMPU tests are important, albeit boring, at least for biostatistics. Drug approval ultimately depends on having a significant p-value, so you definitely want to have power.
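To put a number on the power point (with made-up planning values, not any real trial): using statsmodels you can back out the per-arm sample size needed for a two-sided two-sample t-test at 80% power, and see how power drops if enrollment falls short.

```python
from statsmodels.stats.power import TTestIndPower

# hypothetical planning numbers: standardized effect size 0.4,
# two-sided alpha = 0.05, target power = 0.80
analysis = TTestIndPower()
n_per_arm = analysis.solve_power(effect_size=0.4, alpha=0.05, power=0.80,
                                 alternative="two-sided")
print(f"required sample size per arm: {n_per_arm:.0f}")

# power actually achieved if only 80 subjects per arm are enrolled
power = analysis.power(effect_size=0.4, nobs1=80, alpha=0.05)
print(f"power with n=80 per arm: {power:.2f}")
```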

Posted
5 hours ago, StatsG0d said:


A lot of departments are in the process of revising their PhD curricula, or at least discussing changes to them. I think most programs will continue to require at least one semester of measure-theoretic probability -- at some schools, department chairs/graduate coordinators are also adamant about keeping the two-semester requirement. And linear models will probably stay the same.

But I think the other advanced statistical inference classes (post-Casella & Berger math stat) will eventually be updated to de-emphasize extremely detailed study of "classical" topics. The issue seems to be that for a lot of the advanced classes, the same faculty have been teaching the same class for many years. It takes a LOT of time to design a new course, and in some cases it requires learning new subjects entirely (if you're accustomed to just teaching the "traditional" topics). But once the new class is designed, I think it shouldn't be that difficult to keep teaching it or make minor tweaks to it. Getting to that point takes time, though.

Posted

I have a sneaking suspicion someone (likely Stat Asst. Prof) already answered this, but just in case I've misremembered:  is there any school/course that comes to mind as what you want to see in an updated statistical inference course?  The closest thing I've seen is Stanford's 300c (here:  https://statweb.stanford.edu/~candes/teaching/stats300c/index.html).  

(I'm not really experienced enough to have overarching opinions on Stat PhD curricula, but I'll second all the suggestions for more computation, and add that, in my experience, some additional emphasis on algorithm design, numerical linear algebra, matrix decompositions, and maybe factor analysis/matrix-based models would be nice as part of the core curriculum.)
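To make the numerical linear algebra suggestion concrete, here's the kind of small example I'd like to see in a computing course: solving least squares via a QR decomposition rather than the normal equations, since forming X'X squares the condition number. (The simulated near-collinear design below is just my own toy example.)

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 200, 5
X = rng.normal(size=(n, p))
X[:, 4] = X[:, 3] + 1e-6 * rng.normal(size=n)   # two nearly collinear columns
beta_true = np.arange(1, p + 1, dtype=float)
y = X @ beta_true + 0.01 * rng.normal(size=n)

# normal equations: solve (X'X) b = X'y  -- squares the condition number of X
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# QR factorization: X = QR, then solve R b = Q'y  -- numerically stabler
Q, R = np.linalg.qr(X)
beta_qr = np.linalg.solve(R, Q.T @ y)

print("cond(X)   =", np.linalg.cond(X))
print("cond(X'X) =", np.linalg.cond(X.T @ X))   # roughly cond(X) squared
print("normal equations:", beta_normal)
print("QR              :", beta_qr)
```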

 

Posted
2 hours ago, Geococcyx said:


Yes, that Stats 300C class at Stanford is one possibility. I would say that a PhD-level advanced inference class should focus less on topics like UMVUE, the Neyman-Pearson lemma, admissibility, etc., and more on things like theory for shrinkage methods, convex/nonconvex optimization, reproducing kernel Hilbert spaces, resampling methods, etc. That's because the latter topics are of more current interest and are active areas of research.
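For instance, on the shrinkage side, even a first lecture can start from the closed-form ridge estimator and then build toward the theory. A minimal numpy sketch with simulated data and arbitrarily chosen penalty values:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 50, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]                     # sparse true signal
y = X @ beta + rng.normal(size=n)

def ridge(X, y, lam):
    """Closed-form ridge estimator (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

for lam in (0.0, 1.0, 10.0, 100.0):
    b = ridge(X, y, lam)
    print(f"lambda={lam:6.1f}  ||beta_hat||_2 = {np.linalg.norm(b):.3f}")
# larger lambda shrinks the coefficient vector toward zero (lambda=0 is OLS)
```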

Posted

I think the best first-year plan is probably Columbia's. They have 4 different tracks: probability, theoretical statistics, applied statistics, and data science (joint with CS and managed by Blei himself). Students in different tracks take different classes (with some overlap) and different qualifying exams. This way, no one wastes time.

The coursework within each track looks very rigorous and in-depth to me. For instance, if you specialize in probability, the probability sequence is 3 semesters instead of 2.

Other than this, it is also good to have a more hands-off approach such as Harvard's, where courses do not take up much time and students can arrange their own studies (perhaps in consultation with their advisor).
