jpmangogg Posted June 20, 2013 Other than the stats languages like R and SAS, are there any programming languages or skills that are important for statisticians to know?
cyberwulf Posted June 20, 2013 Possibly C, but only for academic researchers. If you're doing stat genetics or big data, languages like Python and Perl could be handy to pick up.
timmmythetooth Posted June 21, 2013 If you are exclusively doing statistical analyses of relatively small datasets, with no real need to interface with larger systems or applications, R alone is probably fine. For anything more, I'd recommend also learning Python, since it is both extremely versatile and easy to use. C is a very important language in the grand scheme of things, but it's not for the faint of heart. Unless you really need to write lightning-fast code, doing things in Python is fine. Finally, some knowledge of databases could definitely come in handy. Unless you have reason to do otherwise, you should start with some form of database that uses SQL (MySQL, PostgreSQL, etc.). It's not too hard to learn either.
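To give a feel for the SQL suggestion above, here is a minimal sketch that uses Python's built-in sqlite3 module instead of MySQL or PostgreSQL, so nothing has to be installed. The table name, column names, and data are made up for illustration; the GROUP BY query itself is standard SQL and should carry over to other databases largely unchanged.

```python
import sqlite3

# Hypothetical example data: (patient_id, treatment arm, response value).
rows = [
    (1, "placebo", 4.1),
    (2, "placebo", 3.9),
    (3, "drug", 5.6),
    (4, "drug", 6.0),
    (5, "drug", 5.8),
]


def mean_response_by_arm(records):
    """Load records into an in-memory SQLite table and summarize by arm."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE trial ("
            "patient_id INTEGER PRIMARY KEY, arm TEXT, response REAL)"
        )
        # Parameter placeholders (?) keep the data separate from the SQL text.
        conn.executemany("INSERT INTO trial VALUES (?, ?, ?)", records)
        cur = conn.execute(
            "SELECT arm, COUNT(*), AVG(response) "
            "FROM trial GROUP BY arm ORDER BY arm"
        )
        return cur.fetchall()
    finally:
        conn.close()


summary = mean_response_by_arm(rows)
for arm, n, mean in summary:
    print(f"{arm}: n={n}, mean={mean:.2f}")
```

The aggregation happens inside the database, so only the per-arm summary comes back to Python. That is the main reason SQL is worth knowing once the data no longer fits comfortably in memory.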
DMX Posted June 21, 2013 Depends on your research interest, but I would say (in order of decreasing importance): R, MATLAB, and C++. Emphasis on the last two if you are interested in developing algorithms.
notafrequentist Posted June 22, 2013 Python is great to learn, and it's not too difficult to pick up. There are plenty of online tutorials for it. I recommend both codingbat and codecademy, but be aware that sometimes codecademy will reject code even when it's entered completely correctly. Also pick up NumPy, SciPy, and matplotlib, which are free scientific computing libraries that together make Python an alternative to MATLAB. There is also Sage, a free mathematics package built on Python that handles symbolic calculations and is a solid alternative to Mathematica.
wine in coffee cups Posted June 25, 2013 For analyzing data, R is the biggie. You'll probably see a fair amount of SAS if you are in biostatistics (more the traditional clinical trials side than genetics). Statisticians don't use Stata or SPSS, but collaborators and consulting clients in the social sciences use those heavily, so a passing familiarity is nice to have in those cases. C and MATLAB are more specialized: I'm sure some people in my department use these all the time, but the majority basically never do. I don't like to do heavy data cleaning and manipulation in R, and certainly not for anything large. For general-purpose data processing, Python is useful. In any language, knowing regular expressions is very, very useful for cleaning up messy data. I definitely recommend picking up SQL for data extraction and aggregation (which you can use right in R or SAS). Some of the students who were leaving with a master's and looking for industry jobs found that employers often wanted SQL skills. For papers, reports, and presentations, you need to learn LaTeX. I used knitr in RStudio to integrate the TeX with R output and graphics seamlessly.
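As a small illustration of the regular-expression point above, here is a sketch in Python using only the standard re module. The function names and the sample strings are hypothetical, and the pattern is deliberately simple (it handles plain decimals and thousands separators, not scientific notation or ranges).

```python
import re

# An optional sign, digits, and an optional decimal part.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def parse_measurement(raw):
    """Return the first number found in a messy string, or None if absent."""
    cleaned = raw.replace(",", "")  # drop thousands separators like "1,204"
    match = NUMBER.search(cleaned)
    return float(match.group()) if match else None


def normalize_label(raw):
    """Collapse runs of whitespace, trim the ends, and lowercase a label."""
    return re.sub(r"\s+", " ", raw).strip().lower()


# Hypothetical messy inputs of the kind that show up in hand-entered data.
for value in [" 72.5 kg", "1,204 mg", "N/A"]:
    print(repr(value), "->", parse_measurement(value))
print(normalize_label("  Treatment   Group A "))
```

The same patterns work nearly unchanged in R (gsub, regmatches) and in most SQL dialects that support regular expressions, which is why the skill transfers so well across tools.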