Statistics, profile eval, programs to apply, safe school choices

DanielWarlock · October 19, 2019

I'm applying to PhD programs in statistics (primarily) and CSE (computational sciences) this year. It'd be very helpful if you could take a look at my profile and give me some advice on programs to apply to, especially ~2 safe schools.

My profile: I'm currently doing a master's at Harvard (CSE master's, GPA: 4.0/4.0). I did my undergrad at University of Toronto (Engineering Sciences, GPA: 3.92/4.00). No GRE math (I took it without sufficient prep and got a terrible score--76%) and will probably not reveal it to anyone. General GRE: 166 (V), 168 (Q), 4 (W).

Recommendation letters: Both professors at Harvard know me and my recent work very well -- one is my advisor and I have discussed research with the other prof frequently; I nailed their classes (PhD inference II and probability II respectively). My advisor said he wanted me to continue working with him as a PhD student but cannot guarantee admission. I'd like to say both profs think highly of me but that could be out of politeness. The other one is my old senior thesis advisor, who does not know my recent progress but is very well known in the community. I think my work with him is somewhat solid but not very good.

Work experience: e-trading developer at Citigroup, quant research intern at a pension fund (OTPP). All these are like 2 years ago and I have left finance stuff since.

Research: there is probably no time for my work to be peer-reviewed before application. I know this is very bad. I need to rely on my advisors to testify to the quality of the work. Will the people on the admission committee at least take a look at the preprint, or do they just not care? Anyways, two Harvard profs know my most recent work and I have to trust them to talk about it at least a little.

School list: These are schools I want to get into--I have not included safe schools. I will probably fail all these. So I need about 2 safe schools--and I will substitute them into the list. What are some good choices? Notice that I excluded Stanford statistics because that is a waste of time with my math GRE. I may drop Chicago for some safe schools as they seem to have a penchant for people with a crazy math track record incl. high math GRE. Any advice on the list in general is appreciated.

List of research articles:

Li, Yufan, and Pillai, Natesh. "Regeneration Sampling with Multiple Atoms." Preprint (2019). [Link to Article]

Li, Yufan. "Markov Chain Monte Carlo Algorithms and Related Convergence Properties." Undergraduate thesis, University of Toronto. [Link to Article]

Li, Yufan, and Rosenthal, Jeffrey. "A Divergent Random Walk on Stairs." arXiv preprint arXiv:1808.10121 (2018). [Link to Article]

Li, Yufan. "Improve Orthogonal GARCH with Hidden Markov Model." arXiv preprint arXiv:1909.10108 (2019). [Link to Article]

Relevant Courses: I did not take a ton of classes at Harvard because I need to work on research, so just the bare minimum: the standard PhD courses on probability and inference.

Statistics (Standard PhD Sequence)

STAT 210 Probability I (Harvard), A

STAT 212 Probability II (Harvard), A

STAT 211 Inference I (Harvard), A

STAT 213 Inference II (Harvard), A

STAT 230 Multivariate Statistical Analysis (Harvard), Ongoing

STAT 244 Grad. Linear & Generalized Linear Models (Harvard), Ongoing

STAT 220 Bayesian Data Analysis (Harvard), Ongoing

Computing

AM 205: Advanced Scientific Computing (Harvard), A

CS 207: System Development for Computing (Harvard), Ongoing

CS 205: Distributed and Cloud Computing (Harvard), Ongoing

Mathematics

MATH 122: Group, Ring Theory & Vector Spaces (Harvard), A

MAT 336H1 Real Analysis (UofT), 99

MAT 334H1 Complex Analysis (UofT), 100

MAT309H1 Math. Logic (UofT), 100

APM 384 H1 Partial Differential Equations (UofT), 100

MAT292H1 Calculus III (UofT), 100

MAT195H1 Calculus II (UofT), 99

MAT191H1 Calculus I (UofT), 97

MAT185H1 Linear Algebra (UofT), 92

Edited October 19, 2019 by DanielWarlock

statfan · October 20, 2019

You have excellent grades in your math and statistics courses. However, your list is very top heavy and your math background is a bit lacking for these top schools for statistics. I noticed that the real/complex analysis courses you took tend to be more computation-focused rather than proof-based. You would be better off if you took the advanced version of real/complex analysis etc. I think slightly lower grades (say 85-90) in more rigorous courses will look more impressive than near perfect grades in easier courses.

Edited October 20, 2019 by statfan

DanielWarlock · October 20, 2019

Thank you statfan. I don't have a pure math background from my undergrad, so these were mostly engineering requirements. I thought that taking graduate probability and inference during my master's would help. But yes, the competition is stiff and my coursework is definitely not the most impressive. What range/schools would you recommend I apply to in order to be safe?

statfan · October 20, 2019

I would mainly focus on larger programs such as Penn State, NC State, Iowa State etc. These are very solid programs and are easier to get into than their rankings would suggest. Since your math background is a bit thin, I would submit the math GRE score to schools that recommend it. Most people who take the math GRE are applying to math PhD programs. A 76th percentile, by definition, is not a low score, especially with your math background.

Edited October 20, 2019 by statfan

DanielWarlock · October 20, 2019

My plan is to do a bunch of reach schools and 2 absolutely safe schools. The schools you listed may not actually be very safe for me since they are still quite good. What about 9 crazy schools plus UIUC and University of Florida? I should be a shoo-in for the latter two schools, right? Is University of Florida that easy to get into? I'm thinking of replacing UIUC with Chicago if that's the case.

statfan · October 20, 2019

There are no truly safe schools for statistics PhD admissions. It does not hurt to try a few reach schools, but I would recommend you apply broadly since your math background is weaker than that of the average applicant at the top schools you listed. I would apply to at least 3 schools in the Penn State to Florida range.

Edited October 20, 2019 by statfan

DanielWarlock · October 20, 2019

"A quick update: I got 760 on the Math subject GRE test, which is the 72nd percentile. I don't know anything about abstract algebra, complex analysis and graph theory, so I consider it an ok but not great score. I already submitted to Stanford and UPenn, and I am wondering if I should submit it to schools..."

Thanks statfan. I saw you probably had a similar profile to mine and probably a similar math background (not in pure math). Did you send your GRE math score to all schools? What were your application results? Where did you end up?

Guest · October 20, 2019

I do think you should add a couple of lower-ranked schools, but I don't think you need to apply to schools outside the top 25 and I absolutely think you should apply to the schools you listed. You have awesome grades and a great background from two top schools (your math background is more than sufficient), a good GRE, and your research is way beyond what most people have. I'm assuming you're getting a letter from Rosenthal, who is one of the most famous statisticians in the world, and with whom you wrote a real paper -- your profile will really stick out.

I tend not to want to give evaluations for international applicants because I'm less familiar with the criteria, but your profile is really stellar and I would personally be shocked if you didn't get into a few top 10 programs.

DanielWarlock · October 20, 2019

Yes. I went Monte Carlo/MCMC all the way since doing a thesis with Professor Rosenthal. Both profs at Harvard are experts in MCMC as well and hopefully they will say I know what I'm doing (somewhat). I really loathe taking courses, especially high-powered stuff such as "graduate real analysis I, II" or "graduate algebra I, II"--this is taught at Harvard by some crazy person and he went way beyond standard graduate texts such as Royden, Rudin etc. with PDE and harmonic XXX because math PhDs at Harvard are that good. In a perfect world, I'd have taken such classes and gotten an A, but it is simply too much stress and so unnecessary for someone who does statistics. Plus I'd probably have to do zero research to cope with classes and thus land bad letters. I read baby and papa Rudin on my own instead and studied linear algebra again in detail in grad school. I even outperformed some (if not most) Harvard PhDs in their probability and inference classes. I totally flopped the math GRE despite my love for it. It has many clever problems with emphasis on basics. I still don't know how I could possibly fail that badly. I literally fell down the stairs seeing the score (resulting in minor injuries).

Oh well, it actually doesn't matter that much. Not everyone is meant to be a scientist. I have decided to take my chances with the 9+2 formula with UIUC and UFlorida. I don't actually want Chicago that badly upon looking at their faculty. It is also said that there is a "cut-throat" culture so I might not survive even if I got in. UIUC is a good computer science school (maybe a chance to collaborate) and Florida has Monte Carlo people plus top-notch living conditions. Both seem like great deals. The other 9 schools, Michigan and Duke to a lesser degree, are highly reputable--so no regret in missing out on the "prestige". And if I can't get into any of these, so be it. Time to go home anyways. Thank you for your advice, statfan and bayessays! Good luck with your research and stuff.

Stat Assistant Professor · October 20, 2019

I think you have got a pretty good shot at UF and UIUC. UF is pretty renowned for MCMC and has great placements. One guy from UFlorida got a faculty job at University of Minnesota this past year (a well-regarded program) without any postdoc, because he worked on cutting edge stuff on MCMC: https://cla.umn.edu/about/directory/profile/qqin

The job placements at UF and UIUC in general are also pretty good if you are interested in faculty jobs. You'll see PhD alumni from these schools in postdocs at places like Columbia, UPenn, and Carnegie Mellon, and there are professors with PhDs from these schools at places like Duke Statistical Science (https://resteorts.github.io/), Harvard Biostatistics (https://www.hsph.harvard.edu/brent-coull/), and UT-Austin Statistics & Data Science (https://cns.utexas.edu/component/cobalt/item/19-statistics/4025-linero-antonio?Itemid=349).

Edited October 20, 2019 by Stat PhD Now Postdoc

DanielWarlock · October 20, 2019

Thank you "Stat PhD Now Postdoc". I'd assume you are associated with UF given your detailed knowledge of the job placement. Do you think I should submit my GRE math score for UF and UIUC? My guess is that as they are less competitive than schools like Stanford or Chicago, a 76% may actually help my case at UF and UIUC? I could be wrong though since I have no idea what a typical pool of applicants would look like.

Bayequentist · October 20, 2019

IIRC Stat PhD Now Postdoc mentioned that he did his PhD at UF.

You can search the database of gradcafe for GRE scores: https://www.thegradcafe.com/survey/index.php?q=uiuc+statistics and https://www.thegradcafe.com/survey/index.php?q=university+of+florida+statistics.

Stat Assistant Professor · October 20, 2019

1 hour ago, DanielWarlock said:

Thank you "Stat PhD Now Postdoc". I'd assume you are associated with UF given your detailed knowledge of the job placement. Do you think I should submit my GRE math score for UF and UIUC? My guess is that as they are less competitive than schools like Stanford or Chicago, a 76% may actually help my case at UF and UIUC? I could be wrong though since I have no idea what a typical pool of applicants would look like.

That is a respectable score, but I would caution that many other international students will have higher math subject GRE scores, even at schools like UF and UIUC. A lot of the Chinese students at UF and UIUC in particular will have very high math subject GRE scores (95+ percentile), and your application will be compared to theirs more than it will to domestic applicants. I would keep that in mind. As these schools do not require submitting the subject GRE score, it might not be worth submitting it.

DanielWarlock · October 21, 2019

That's exactly what I thought. I will not submit the score anywhere. The Stanford average on that test is like 90%+ this year as well, and that's with domestic applicants included. This test has become like the quant section of the general GRE. Given that everybody gets perfect scores on courses and the GRE, I think if I ever get in anywhere it would be because of my research and references. No need to divert attention to other stuff. We will see what happens.

Stat Assistant Professor · October 21, 2019

Is there any particular reason why you do not want to apply more broadly than 9 top schools and 2 "safety" schools? Admissions is still competitive for international students at the level of UF and UIUC. It may help to apply more broadly to a range of schools in the USNWR top 40 rankings to improve your chances of being admitted to a good program. You don't need to go to a "top-tier" program to land a good job post-graduation (though it certainly helps). For industry, the reputation of the school is not so important as long as the doctorate is not from an obscure regional/directional program. And in order to get good postdocs and faculty jobs, your publication record and the reputation of your PhD/postdoc advisors (and other recommendation letter writers) are what matter the most.

DanielWarlock · October 21, 2019

I think you are right to a degree. First, the pragmatic reason: I chose to do 9+2 because my friend applied last year and he did exceptionally well by applying to only top schools, receiving only 2 out of 13 but very high-quality admissions (he applied to 13 programs in total). The reasoning was that I just need to be admitted to 1 school and having too many safety choices would be a waste. Plus, safety is not really safe and reach may not be reach after all, given all the other factors such as interest match. But I have now decided to also do UNC and dropped Princeton from my list, just because UNC has several MCMC faculty members and Princeton has none. The new list is now: Harvard, Berkeley, MIT(CSE), Stanford(CSE), Columbia, UPenn, Duke, Michigan, UNC, UIUC, UF.

Yes, it is top-heavy, but I think the risk is not actually that bad: I should have a 50% chance of getting into each of the last three schools on average, given that UNC and UF have several MCMC faculty members. I will say a 10% chance for Harvard, Duke, Michigan each on average (my advisor actually said 80% chance, which I will take as hyperbole). Now I have like a 9.1% chance of being rejected by all these schools. I will not count my chances at Berkeley, MIT(CSE), Stanford(CSE), Columbia, UPenn, just to offset whatever inflation I might have included. But UPenn, Columbia and Berkeley all have dedicated MCMC faculty, whereas MIT and Stanford need to admit people with interests in statistical computing and randomized algorithms; not as many stats people will apply to those programs so I might just get lucky from interest match. Now I have a below 10% chance of failing completely.
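For anyone checking the arithmetic: the 9.1% figure above follows if the six decisions are treated as independent events, which is an assumption and probably an optimistic one, since every committee sees the same profile. A minimal sketch in Python, using only the guessed probabilities from this post (the names `admit_prob` and `prob_rejected_everywhere` are made up for illustration):

```python
from math import prod

# Admission probabilities as guessed in the post.
# Assumption: decisions at different schools are independent.
admit_prob = {
    "UNC": 0.5, "UIUC": 0.5, "UF": 0.5,
    "Harvard": 0.1, "Duke": 0.1, "Michigan": 0.1,
}

def prob_rejected_everywhere(probs):
    """Probability of zero admissions when decisions are independent."""
    return prod(1.0 - p for p in probs.values())

p_fail = prob_rejected_everywhere(admit_prob)
print(f"P(rejected by all six) = {p_fail:.4f}")
```

That is 0.5^3 x 0.9^3 = 0.091125. If the outcomes are positively correlated, the chance of being rejected everywhere would be higher than this product suggests.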

Let's say that even though I was being conservative already, the actual odds are worse. The thing is that I have already operated under optimal conditions at the undergrad and master's level (best schools, best advisors, no financial worries) and have already worked my butt off in a way. This is not like someone from India or China who is very talented but is put at a disadvantage due to their background and origin. The point is that if I were still to be rejected by all these programs, then I am proven to be mediocre beyond reasonable doubt and would be better off doing less challenging work and leaving research opportunities to others. That's why I said so be it.

Guest · October 21, 2019

You may want to look into UT Austin, which has some world-class MCMC people.

DanielWarlock · October 21, 2019

Bayessays, I checked the faculty page but didn't see any MCMC people there. Which professor are you referring to? Thank you!

Page I checked:

https://stat.utexas.edu/people/core-faculty

Guest · October 21, 2019

Stephen Walker does a lot of Bayesian algorithm stuff (lots of nonparametric/Dirichlet process, but look through his very long CV and you'll find MCMC/Gibbs sampler papers). Sinead Williamson works directly on parallelizing MCMC algorithms. The whole department is Bayesian, so I'm sure some others have at least done related stuff, though Bayesian inference has been moving towards optimization-based approximations more recently, so it may be harder to find lots of MCMC people anywhere.

Two other departments you may want to look at are Minnesota (Galin Jones works on actual MCMC theory and may be a great fit for you) and Iowa, which has a lot of MCMC people.

DanielWarlock · October 21, 2019

Thank you for reminding me! Nonparametric Bayes (Dirichlet processes, urn representations, the Chinese restaurant process) is a standard area in MCMC. I almost forgot that. In fact, I think my current research may be extended naturally to such problems to obtain easy convergence diagnostics. Very interesting idea actually. I wish I had thought of that 2 months ago. Another interesting thing you reminded me of in nonparametrics would be Gaussian process stuff. In general, the auxiliary dual space to the original MCMC may be defined as a space of functions/distributions rather than a partition cell of the state space. I will see if I can work out some interesting use cases and will definitely keep that in mind when discussing potential projects with nonparametric faculty. Thank you very much.

As a side note, I'm aware of the big data trend in Bayesian statistics as far as sampling is concerned. The posterior becomes too hefty and parallelization does not seem to be a direct solution to that. There are some "subsampling", "approximate/perturbed chain" techniques and bounding techniques, but it is still an open problem. Is there a way to address that using optimization? The industry standard seems to be to just randomly pick a subset from all the data each time, sometimes with a tailored hypothesis. A neat idea in regression I saw the other day is to assign a distribution to each multiplicative factor and draw samples from it to use in the computation. This distribution is chosen so that the variance of the estimator for the coefficients is minimized. I could see some adaptive algorithm being developed using an analogous idea for a Gibbs sampler for a Bayesian posterior, i.e. each step is taken in consideration of a subset of all the data, but the selection of this subset is adjusted gradually from uniform towards optimality.

Oh well, I can go on and on about Monte Carlo. Might not even have a chance to work on these things if I failed my PhD application. We will see what happens.

Stat Assistant Professor · October 21, 2019

An emerging area in the MCMC literature right now is approximate MCMC, where you replace the Markov transition kernel with a low-rank approximation so that it is faster than vanilla Gibbs sampling/MH algorithms. James Johndrow at UPenn Wharton works a lot on this area, and you can check out some of his papers. In addition, I have seen Bayesian coresets work being done, where you approximate the full data set with a much smaller, weighted random subsample at each iteration (so you can run MCMC faster on the weighted subsample than the full data set): https://arxiv.org/abs/1605.06423
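To make the weighted-subsample idea concrete, here is a minimal Python sketch. It is not the coreset construction from the linked paper: it just draws one uniform subsample and weights each point by N/m, so that the weighted log-likelihood is an unbiased estimate of the full-data one. The toy Gaussian model, the names `log_lik` and `uniform_weighted_subsample`, and all sizes are assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): N draws from a normal model with unit variance.
N = 100_000
data = rng.normal(loc=1.5, scale=1.0, size=N)

def log_lik(theta, x, weights=None):
    """Gaussian log-likelihood in theta, up to an additive constant.

    If weights are given, each point's contribution is scaled by its weight.
    """
    terms = -0.5 * (x - theta) ** 2
    if weights is None:
        return terms.sum()
    return (weights * terms).sum()

def uniform_weighted_subsample(x, m, rng):
    """Draw m points uniformly without replacement; weight each by N/m."""
    idx = rng.choice(x.size, size=m, replace=False)
    weights = np.full(m, x.size / m)
    return x[idx], weights

theta = 1.0
full = log_lik(theta, data)
sub_x, sub_w = uniform_weighted_subsample(data, m=1_000, rng=rng)
approx = log_lik(theta, sub_x, sub_w)
print(f"full: {full:.1f}  subsample estimate: {approx:.1f}")
```

An MCMC step that evaluates `log_lik` on `(sub_x, sub_w)` touches 1,000 points instead of 100,000. A coreset method would instead choose the points and weights to control the approximation error, rather than sampling uniformly.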

I think MCMC and its related theory is still an active research area, but it is a bit more difficult to publish papers on it unless it is truly state-of-the-art (for application or theory). So papers that simply verify geometric ergodicity for a model using the "traditional" drift and minorization methods may not fly well for the top journals. But if you work on something very state-of-the-art, it should be fine. The guy I linked to above, Qian Qin at University of Minnesota (a PhD alum of University of Florida) has initiated several new tools for theoretically analyzing MCMC which were not previously considered (e.g. using Wasserstein-based methods). I think there will be a lot of interest in MCMC in the future, as long as it can assert its relevance to "big data" through things like approximate MCMC, weighted subsampling schemes, etc.

DanielWarlock · October 21, 2019

Exactly. That's the major reason I applied to UPenn instead of Chicago. I spent a summer with Rosenthal's PhD student (Jun Yang if you know him) doing this approximate chain stuff. The idea was to generalize Rosenthal's 1995 quantitative bound so that it provides complexity instead of an exact bound for higher dimensions. The catch is that Rosenthal's minorization and drift set-up blows up to 1 very quickly, so in high dimensions it falls apart. So Jun's idea is to add a "large set" in addition to the "small set" so that all ill-conditioned parts of the state space are discarded. It's a neat idea but the amount of calculus is through the roof even for basic stuff like the Stein estimator and is really not practical at all. I thought a lot about this but didn't end up writing anything down. There is basically no good theoretical work on this that is neat enough for those who do not have expertise. Natesh Pillai and Aaron Smith at UOttawa had a paper where the bound assumes a lot of things which are not practical to verify. The tools are modified from the discrete chain literature, such as conductance. We actually talked to Smith and he thought it was his "less proud" paper. I talked to Natesh and he was not even that interested in this anymore, so we worked on this dual space/discretization project, which essentially turns a continuous chain into a discrete one and thus has high-dimensional applications and is very easy to implement.

I feel subsampling routines with artificially defined "optimal weights" are the way to go. Take a look at this paper: https://www.tandfonline.com/doi/abs/10.1080/01621459.2017.1292914

Now, what if I want to do this with a Bayesian posterior, or what if the sample size is so large for the original logistic regression problem? The idea of Monte Carlo is essentially to address "big data" situations like this, where the whole picture is unknown due to the size of the problem but local information is available, such as a subset of the entire data. So long as you can find a set-up, it will give you the exact solution asymptotically. I will focus on talking about this kind of stuff because I think even the machine learning community would be interested.

Edited October 21, 2019 by DanielWarlock
