Seeking for advice on rather sensitive reality of ML/stat/Data related research

Statmaniac · March 6, 2021

Hi all,

I want to ask for advice from more experienced researchers in this forum on some of the ongoing thoughts I have towards ML/stat/algorithmic/computational research. As a PhD student researching machine learning/statistics, I have been overwhelmed by a chain of negative aspects I saw in the actual research phase. Below are some features I find very uncomfortable staying in these fields.

1. A significant portion of research papers are not reproducible. I understand that writing a paper that perfectly explains all the nitty-gritty details of the methodology is very difficult, but at the same time, I realized that far too many papers do not explain enough detail for readers to actually use the methods they introduce. What is worse is that a significant number of papers contain errors, and I have seen reviewers simply disregard such errors, treating them as "trivial". An upper-year student who graduated last year found at least three major errors in his advisor's previous paper, but his advisor ignored them. He fixed all the theory by himself, submitted the corrected version of the proof, and got it published with his advisor's name on it. Moreover, this was published in one of the top 4 venues that statistics people regard highly. His advice to me was simply to accept the reality and to graduate with my PhD without making a fuss. It seems that more experienced people take a somewhat flexible attitude of "learn what you need to and ignore what doesn't feel right/make sense," a.k.a. look at the bigger picture, not the details, or "You do not need to do the correct research, as research, by nature, is prone to error". I agree that understanding the bigger picture is important, but it seems the way research is done often dismisses details too much. I try to follow/learn this attitude, but it just seems very hard and somewhat arbitrary.

2. Due to publication pressure, I feel that there are so many meaningless papers. In fact, I also submitted two papers this year, and I am not proud of either of them. I can hardly see practitioners using them. Methodologies these days are much more complicated than in the past, and I feel they have become too complicated to be actually useful in practice. My advisor seems to be satisfied with my work, but he can't seem to understand why I am not happy. I guess I am just another one who is merely trying to survive in this crazy "competition" instead of doing "real" and "meaningful" research.

These thoughts have negatively influenced my research work to the degree that I started to question whether this is indeed the way I want to spend the rest of my career. I was fascinated and excited by creative/beautiful ideas of bridging theories with actual data analysis or solving some real-world problem using my quantitative skills, but in reality, it seems that, by the nature of the discipline, there are a lot of dark corners that I didn't see before. I am sure some other people in the forum have had similar feelings at some point in their lives. I wonder how they dealt with such feelings and moved on.

Edited March 6, 2021 by Statmaniac

Stat Phd · March 6, 2021

That’s a great point. I have seen that with most conference papers. Are you publishing in conferences?

The truth is that conferences are unreliable due to the sheer volume of papers.

Many of my friends (newbie grads like you) are “reviewing” for conferences. They have no clue how to really assess the quality of a paper (as they tell me personally!).

Also, most of these friends just want to have conference papers to put on their CV and get a tech job later. And their advisors just want to grow their own CVs. The advisors rarely read the papers; they just write the intro and conclusion and give the rest a cursory look.

So it’s really depressing, I agree with you...

Stat Phd · March 6, 2021

But I think you have the right mindset: you should try to do your best but still be pragmatic. Are you looking for an industry job or an academic job?

bayessays · March 6, 2021

I have definitely felt this strongly, and I went through some years of a similar identity crisis before ultimately deciding to return to the statistics/data science world. Like you, I entered graduate school for the first time with a very idealistic attitude. I found the "creative/beautiful ideas of bridging theories" incredibly exciting -- when I'd learn some new piece of theory, it would fill a piece of this large puzzle in my mind that I was trying to assemble of what would someday become the full knowledge of the field of statistics. I was succeeding in my research, but also felt like it was not very serious and that someday I would get to the "real stuff." I thought that the "real stuff" was just hidden behind a few more layers of math and foundational material that I didn't know yet, and that all these problems could be overcome if I just learned a few more things. The people in Annals of Statistics were certainly doing the "real stuff," I thought. But I think eventually you sort of see through these things and realize that there is no "real stuff" and that statistics, for the most part, is a field with a lot of paradigms and ideas but so much complexity that you are in many ways destined to be unhappy if you have both 1) your idealism for the truth and 2) your moral qualms about the quality/arbitrariness of so much research. This is especially true in industry, where you are often asked to answer impossible questions or encouraged to produce results that your bosses prefer rather than the truth. I certainly know others who have felt this way as well.

As for how to deal with these feelings, I think your advisor has a lot of good advice. Don't view your PhD and your potential professor jobs afterwards as solely a quest to answer the deepest questions of the universe -- maybe you can do some of that, but you're in training to gain a set of skills and gain the qualifications that lead to more opportunities. Maybe you're not doing what you want, but honestly, you're probably not having a meaningfully *negative* impact on the world if you view your work as slightly shoddy. In fact, that you're even thinking about these problems means that you are useful in improving the culture of the profession, especially in industry where it is more difficult to succeed with this kind of skepticism.

In reality, what are your career options if you leave the field and how much are you willing to sacrifice? Will you discover a different set of problems there? I suspect so. I decided that even if I'm not going to come up with the perfect theories, I can still learn some interesting things and work on cool problems. I can try to stay true to my morals and not do things that are actively harmful. Being in academia lets me teach. This forum allows me to try and help people in the field. And I get to make a living (and a very good one at that) doing something that I have already invested a lot of time in and have a lot of knowledge in, and I don't think it's worth throwing that away because it's not exactly what I expected.

Euler17 · March 7, 2021

In relation to you not being proud of your recent papers, remember that you probably know the shortcomings/holes in your work better than anyone else. I have struggled with being completely unsatisfied with 2 of the papers I wrote this year because I felt like there were so many ways that they could be improved, but I trusted my professors that they were ideas worth writing about. And maybe I don't get around to improving upon them in later papers the way I envision, but once they are out there others have the opportunity to build on them if they'd like. I def agree with most of the issues you bring up, just wanted to give my two cents on that point.
