vijay120

Everything posted by vijay120

  1. This is my first draft for the CS PhD program at UW. I plan to use this template for all the other colleges I apply to (I know this is not kosher, but I am on a time crunch). Thanks in advance for reviewing it! Here is the prompt: Submit a personal statement of ~1000 words (max 500KB) that includes: a) how you became interested in doing research, b) a relevant project or research experience that shows your technical knowledge and skill, and c) your plans for the future in computer science. You may wish to include information about what you feel are the strengths of your application, such as special interests and abilities, or give explanations for what you feel are any weaknesses in your academic record. Statement of Purpose My primary research objective is to apply Machine Learning (ML) techniques to computational biology. I am currently one of the three founding engineers of 20N Labs, a computational biotech startup backed by Y Combinator and Khosla investments and founded by Dr. Christopher Anderson, associate professor of bioengineering at UC Berkeley, and Saurabh Srivastava, PhD in Computer Science at the University of Maryland. I believe the most important ML investments in computational biology will be in model inference, reduction of the large hypothesis space, and instrument-agnostic modelling techniques for large, heterogeneous biological datasets. Building on my experience developing ML models and graph algorithms for untargeted metabolomics at 20N, I want to advance the state of the art in ML techniques for computational biology. I plan to continue research after my Ph.D., as a faculty member in academia or as a researcher in an industrial lab. 1 Research While I always had an interest in computer science research at Harvey Mudd College and in my early software engineering career, I was initially focused on building reliable, large-scale software systems for enterprises. 
It was only after I decided to follow my curiosity about recent advances in ML techniques for protein folding prediction and drug discovery, and in experimental techniques like CRISPR-Cas9, that I switched to a research engineer role at 20N. At 20N, I worked closely with Dr. Anderson and Dr. Srivastava to develop algorithms for untargeted metabolomics. 1.1 Peak Detection for Untargeted Metabolomics 20N uses Liquid Chromatography Mass Spectrometry (LCMS) instruments to detect chemicals in biological samples. The instrument produces a three-dimensional dataset of mass-to-charge ratio, retention time and intensity values for each of these samples. I first developed a targeted metabolomics pipeline to analyze the LCMS output. Here, we took a list of molecules we suspected were in the sample and tried to detect their peak patterns in the data. Since it was not scalable for a research assistant to look for thousands of peaks in our scans, I developed a signal processing algorithm that used a peak template as a feature detector to find "peak-like" patterns in our scans. This algorithm had a linear runtime, which allowed us to scale to our massive datasets of chemicals and samples. Next, I developed algorithms for the company's untargeted metabolomics pipeline. One part of this problem is to detect 3D peaks in the input dataset without any prior chemical targets to look for. Using instrument data artifacts called "Data Voids" that are found near peaks, I applied k-means clustering to find points of interest in the data. I then implemented a continuous wavelet transform function that could reliably detect chromatographic peaks of differing widths. Using these two approaches, we developed a pipeline that could detect peak artifacts without an input chemical dataset. Finally, I developed unsupervised deep learning models for chromatographic peak detection and mass spectrometry retention time prediction. 
Since our lab did not have large labelled datasets, I used autoencoders for two tasks: learning the underlying structure of peaks versus noise in our LCMS data, after which a human tagged the clusters that looked like peaks; and learning retention time buckets based on chemical structure, after which I implemented a supervised learning algorithm trained on in-house chemical retention times. 1.2 Graph Network Analysis 20N had a repository of reaction operators, functions that operate on input substrates and transform them into their correct products. These reaction operators were curated from our understanding of biochemical reaction mechanisms. If our peak detection algorithms found peaks for both a substrate and the derivatives predicted by our reaction operators, we had more confidence that we had detected the substrate molecule. I developed algorithms that used Prize-Collecting Steiner Trees (PCST) to extract high-value networks from the large datasets analyzed by the peak detection algorithms. These high-value networks gave us a better understanding of the biochemical pathways that were over- or under-expressed in our samples. 1.3 Human sample analysis I conducted statistical analysis of the output of our computational methods on diseased and non-diseased human samples provided by clients. I learnt to design effective experiments to test the efficacy of our algorithms on these samples. This involved asking the right questions, researching their answers through data analysis and presenting the results in a manner that was easily understood by the team. Along the way, I got better at data visualization using tools like Pandas and R. 2 Teaching and Mentorship I TA-ed Operating Systems (CS134), taught by Professor Neil Rhodes, and Data Structures and Algorithm Development (CS70), taught by Professor Melissa O'Neill. I also mentored aspiring women engineers through the Hackbright fellowship program in San Francisco. 
4 Related Courses I took Machine Learning and Algorithms Development at Harvey Mudd. For my continuing education, I took certificate courses in Bioinformatics from UCSD (Bioinformatics I, II and III) and am also taking a certificate course in Biochemistry from UC Berkeley to better understand the domain space. 5 Conclusion I am very interested in Professor Su-In Lee’s work with the ENCODE project to analyze ChIP-seq data for detecting gene network perturbations in cancer. The DISCERN algorithm, which compares and scores expression levels between diseased and wild-type tissue, has a lot of similarities to the approach I took in analyzing metabolomics data from diseased versus wild-type tissue at 20N Labs. Since I thoroughly enjoyed researching and analyzing samples in that project, I hope to bring the same enthusiasm and scientific rigor to Professor Lee’s work. Professor William Noble’s work in proteomics and mass spectrometry analysis also aligns with my research interests and prior field experience. Professor Noble’s work on Percolator 3.0, an ML model for protein identification, and his findings in “Tandem mass spectrum identification via cascade search” are the types of research methods I want to improve on. Having spent time at Mudd and three years in industry, I now want to pursue a PhD, as it would allow me to do more focused research. In my research, I want to advance the state of the art in the ML and AI techniques used to solve computational biology problems.
  2. Some people believe that corporations have a responsibility to promote the well-being of the societies and environments in which they operate. Others believe that the only responsibility of corporations, provided they operate within the law, is to make as much money as possible. Write a response in which you discuss which view more closely aligns with your own position and explain your reasoning for the position you take. In developing and supporting your position, you should address both of the views presented. The extreme positions of corporations being perfectly altruistic on one side and perfectly selfish on the other are problematic, since neither practice is sustainable. One has only to look at the public outcry that followed studies finding that companies using CFCs in their aerosol products had created a massive ozone hole over the Antarctic to see that being perfectly selfish is unsustainable. Conversely, being perfectly benevolent by contributing to society and the environment while not generating profits for shareholders will send investors running with their money. Modern corporations know this and have struck a middle ground: they have started to incorporate corporate social responsibility (CSR) projects into their operations, where a dedicated department works on beneficial social and environmental projects while the rest of the firm generates profit. However, this is problematic, since CSR projects inherently create tension between maximizing value to the shareholder and being benevolent to society and the environment. I propose an alternative that is much more effective than CSR projects: building societal and environmental benefits into shareholder value. An example of this is Danone, a multinational corporation that specializes in making yogurt and other dairy products. 
It recently set up research facilities in Bangladesh to create a yogurt product enriched with nutrients so that a child could survive on the yogurt for a day without becoming malnourished. It used beggars as its distribution channel, since they knew intimately who needed this product the most, and Danone provided financing to buy the yogurt. By doing this, the company was able to find the correct combination of ingredients it needed to sell this type of yogurt in its higher-margin markets in Europe and the Americas. At the same time, the yogurt benefitted hungry children in Bangladesh. Such a scheme shows how corporations can build societal benefits into their shareholder value. Corporations can also market their social and environmental efforts directly to consumers, thereby distinguishing their product in a commoditized space and making it more valuable. For example, organic chicken farms advertise that their chickens were fed fertilizer-free food, were allowed to roam freely and were not confined in a cage, in order to appeal to the socially and environmentally conscious consumer. This in turn helps them profit by selling a pricier product than farms that raise caged chickens. In conclusion, choosing either extreme of pure profit or pure benevolence is impractical and problematic. Even though companies currently follow CSR programs, a more effective alternative is to incorporate societal and environmental factors into the heart of their business model, so that no conflict of interest arises.
  3. "The best way to teach is to praise positive actions and ignore negative ones." Write a response in which you discuss the extent to which you agree or disagree with the recommendation and explain your reasoning for the position you take. In developing and supporting your position, describe specific circumstances in which adopting the recommendation would or would not be advantageous and explain how these examples shape your position. -------------------------------------------------------------------------------------------------------------------------------------------------------- Teaching is well regarded as a noble profession that shapes the lives of our future citizens. This is especially true during the middle school years, when the developing brain of a teen is strongly influenced by its environment, primarily the family and the school. Therefore, it is very important to teach our children good habits and skills effectively, so that they can carry them into the future. There is a widespread belief that the most effective way to teach our children is to praise their positive actions and ignore their negative ones. However, such a thesis is problematic in two ways. First, children need to learn from their negative actions so that they do not repeat them. The way to teach this is by pointing out mistakes clearly and then asking the children to correct them. For example, researchers studying the California education system found that the learning in a single class session diminishes by 15% if the student arrives 10 minutes late to a standard 45-minute session. Therefore, teaching the student to be punctual to class will increase his/her learning, which cannot happen if the student’s negative action is ignored. Second, "praising positive actions" is an overly general statement that needs to be qualified. 
Depending on the type of school the student is attending, “praising” a student’s action can be detrimental to his/her overall development in the classroom. In magnet Math and Science schools in the United States, there is a strong culture of academic excellence that is ingrained in the psyche of most students, especially in their social circles. Therefore, openly praising academically strong students in class will encourage them to do better and also spur other students to match the achievements of their peers. However, in community schools in lower-income neighborhoods, such an open praise strategy would most likely backfire, since the culture of the teens, especially its social aspect, is shaped not only by academics but also by popularity and sports. In this setting, openly praising academically strong students would likely cause them to be maligned by their peers. A more cautious strategy has to be applied in such school cultures. In conclusion, tight feedback loops are very important for developing teenagers. However, teachers have to look at all the incentives, not just academic ones, that drive a student’s progress in school before they institute any strategy like praising only positive actions and ignoring negative ones.
  4. To understand the most important characteristics of a society, one must study its major cities. Societies are a manifestation of collective human behaviors. Since cities comprise millions of people living and working together, the most important characteristics of societies are represented in these places. Three main features are characteristic of any society: its economic, social and religious landscapes. First, the economic model of a society is deeply rooted in the cities people live in. The five $0.99 stores, seven Starbucks, four Walgreens pharmacies and a superstore called Kroger that has everything from organic vegetables to toilet paper are all located within a two-mile radius of Elmwood, Chicago. They are a representation of capitalism and its free market ideology, where competition between companies is encouraged. In sharp contrast, food in downtown Havana, Cuba is rationed by government officials. Here, the ideology of socialism that is embedded in the social psyche of the population has made Cuba one of the last remaining vestiges of socialism on the globe, and competition among companies is frowned upon. Such economic models define how cities are built and are a reflection of the societies that live in them. Second, the economic philosophies of societies closely shape the social models of nations. Stockholm, Sweden has a world-class healthcare policy that is mandated by the government to be free to all its citizens. Diagnostics in every hospital in the capital are free of charge, and companies are required by law to provide women with paid leave during pregnancy. The huge safety net is in part a reflection of the traditionally strong ties within Swedish families. This is in contrast to the small social safety net in Detroit, Michigan, where the ideas of American individualism are still flourishing. This shows how social structures that are ingrained in societies carry over to city policies. 
Third, the spiritual models of cities are also a reflection of their societal structures. For example, the Ganapati Puja is a widely revered pagan festival in Mumbai, India, where the elephant god is worshipped for protecting the soul from evil spirits. This translates to work life, where the god Ganapati is prayed to before any business deal goes through. This is in sharp contrast to the practice of Shintoism in Japan, where people do not worship a pagan god but have a deep loyalty to and respect for their ancestors. At work, senior management and the older generation are held in deep respect. This again is a reflection of how societies manifest their features in cities. In conclusion, many important features of a society carry over to its cities, since cities are where people live and therefore display their behaviors and lifestyles.