luffycullie Posted April 22, 2012

Hello everyone, I am currently an undergraduate majoring in computer science. I have the option to take 3-4 statistics classes and need some advice on which ones would be most helpful for graduate research in machine learning / data mining. If it helps, within ML I'm interested in exploring decision tree learning and neural networks. The statistics courses available are:

Concepts in Computing with Data
Concepts of Probability
Concepts of Statistics
Stochastic Processes
Linear Modelling: Theory and Applications I
Linear Modelling: Theory and Applications II
Sampling Surveys
Introduction to Time Series
Game Theory

If there are other courses outside of statistics (say, in the math department) that you find very relevant, please suggest them too. Thank you!
jjsakurai Posted April 22, 2012

Depends on your knowledge - for instance, if you're comfortable with probability, you probably don't need to take Concepts of Probability. I'd take Stochastic Processes, the two linear modeling classes and Sampling Surveys. If I weren't comfortable with basic prob/stats, then I'd take Concepts of Probability, Concepts of Statistics and the two linear modeling classes. Of course, if you can, you should try to take all the classes you've listed. As to math classes - definitely make sure you're very, very good with linear algebra. Also, if you have space, take real analysis and measure-theoretic probability. They won't be directly useful for practical ML research, but they help by giving you a conceptual framework, and they matter if you want to do theory research in ML.

Edited April 22, 2012 by jjsakurai
luffycullie (Author) Posted April 22, 2012

jjsakurai, thank you for your reply. Pardon my ignorance, but from what I've read about neural networks, they mainly use non-linear models. Would two whole semesters of linear modelling be relevant, then? I would understand if it is important to ML in general, but I'm not sure. Can you also briefly explain how Sampling Surveys would be useful? That was actually lowest on my list because I thought it was the least related. I am a little biased towards financial modelling applications, but graduate studies (a master's) are a priority for me. Thus my draft list was Stochastic Processes, Time Series and Game Theory. Do you think I should trade Time Series and Game Theory for the two linear modelling courses? Thank you for your time.
jjsakurai Posted April 22, 2012

Hmm... my post was intended for a potential PhD applicant. I'm not sure about a master's. Neural nets are a very small part of ML. While big in the early 90s, they're not used that much these days. Linear modeling is very, very widely used - especially when you have a ton of data and other techniques are too computationally intensive. The reason I suggested Sampling Surveys is that sampling is used everywhere in ML, and knowing the theory behind it can be useful. But if you're interested in financial modeling, etc., then yeah - the time series course is probably a much better idea. Game theory is not used at all in ML/data mining. Even in finance, it really doesn't have any applications, so I'd strongly suggest that you don't take it.
luffycullie (Author) Posted April 22, 2012

I am interested in pursuing a PhD, though for financial reasons, as an international student, I would have to enter the workforce first after my master's. Your comments have been really helpful. I think I will go with Stochastic Processes, Time Series and one or two of the linear modeling courses, depending on workload constraints.
j3doucet Posted April 23, 2012

Linear modelling is very useful as a foundational course. If you're interested in neural networks, much of that community has moved into support vector machines, which are very mainstream. Probably the best thing you could do, though, would be to get some experience working with real-world data and with the machine learning systems you're interested in. There are a number of repositories of free data (e.g. http://archive.ics.uci.edu/ml/). Go there, download a likely dataset, and then implement a simple back-prop neural network or the ID3 decision tree learner. You can compare your results with those generated by Weka (http://www.cs.waikato.ac.nz/ml/weka/). The theory you could learn in classes is all well and good, but unless you just want to do pure theory, the application details are going to matter more.
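For anyone taking up the ID3 suggestion, here is a minimal sketch of the step at the heart of that algorithm: choosing the attribute with the highest information gain. It is only a starting point under some assumptions, not a full tree learner. The function names (`entropy`, `information_gain`, `best_attribute`) and the four-row toy dataset are made up for illustration and do not come from Weka or the UCI repository.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    total = len(rows)
    # Group the class labels by the value each row takes for this attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3's greedy choice: the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data, invented for this sketch.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

print(best_attribute(rows, labels, ["outlook", "windy"]))
```

A full ID3 learner would call `best_attribute` recursively on each partition until the labels are pure or no attributes remain. Running the same dataset through Weka is a reasonable way to check the resulting tree.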
tkulk Posted April 23, 2012

Well, traditional neural nets are not the best performing, but it is very much a (re)emerging field. The majority of the work is in deep learning and cortical models such as HMAX. I would take a deep learning course as an advanced elective if it's offered.

Edited April 23, 2012 by tkulk