The four pillars: computational science rests on (1) reading, (2) writing, (3) communicating, and (4) coding/experiments, in this order of importance. Maximize your time spent on (1)–(3) to minimize the time spent on (4).
Read: topic papers, off-topic papers, landmark papers, textbooks. Learn by reading bad papers (that is, ask your supervisor for review work). You need to become the world’s top expert in your PhD topic during the four years; plan accordingly and read continuously. Do not stop reading when you hit problems with your experiments; instead, start reading even more: almost surely, all ML problems have already been solved by someone in some paper. Make sure to read every week, devoting at least 20% of your working time to it.
Write: formalize your ideas and models mathematically and precisely in LaTeX. Written communication is the backbone of science, and writing forces you to conceptualise and clarify your thoughts. Aim for publication quality. If you hit a problem in your experiments, spend more time making your assumptions, hypotheses, models, background, context, motivation, related works, etc. more precise.
Communicate: prepare presentation slides for every meeting, no matter how casual; your colleagues and supervisors will appreciate it! Aim for conference-presentation quality. Practising presentations from the very beginning is helpful, since it forces you to conceptualise your work. Remember that your supervisors work on 20 other projects and need context for every meeting.
Run isolated experiments: change only one (or a few) things at a time to narrow down cause and effect. Find the “bedrock” first: the simplest sensible baseline, and verify that it works. Keep adding things one by one to see when they break down. Identify the problem you want to solve in the baselines, and make sure it actually exists. If you get stuck, avoid the temptation to spend more time running experiments. Instead, start reading, writing, and discussing more to confirm that the problem is real and the solution is correct.
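The one-change-at-a-time loop can be sketched in code. This is only an illustrative sketch: the component names and the `train_and_eval` function are hypothetical placeholders standing in for your own training pipeline.

```python
# Hypothetical ablation sketch: start from the simplest sensible baseline
# ("bedrock") and enable exactly one component at a time, so any change in
# the result can be attributed to that single component.

def train_and_eval(config):
    # Placeholder: substitute your real training/evaluation routine here.
    # For the sketch, the "score" is just the number of enabled components.
    return sum(config.values())

# Baseline: all extra components switched off.
baseline = {"augmentation": 0, "dropout": 0, "scheduler": 0}

results = {"baseline": train_and_eval(baseline)}
for component in baseline:
    config = dict(baseline)
    config[component] = 1  # flip exactly one switch
    results[component] = train_and_eval(config)

for name, score in results.items():
    print(name, score)
```

Comparing each single-component run against the baseline isolates which addition helps, hurts, or breaks things; the same loop structure works whether the toggles are data augmentations, model parts, or optimizer settings.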
Don’t chase SOTA: benchmark results are not scientifically interesting in themselves. Instead aim to find qualitative improvements, gaps in the literature, or the insight behind SOTA. These often come from understanding related works and your own model more in depth.
Solutions are cheap, problems are golden. Any ML problem can be solved by anyone. Instead of finding solutions, focus on finding problems that are important and real. Focus your time on understanding the problem; the solution will then emerge.
Don’t hide from your supervisors. We love to talk about science, we love to be challenged and proven wrong, and we love to hear about your ideas and progress. If you spend a week reading, don’t say “I have no new results”; you have made lots of progress by learning new things, so describe that progress. Keep (i) a research diary, (ii) a literature review, and (iii) a technical project report. Share these as a single-click URL with your supervisors.
Network and socialise: attend NeurIPS and ICML every year, even if you have no paper to present. Attend journal clubs. Make sure you have an online presence (website and GitHub) so that colleagues and bigshots can find you.
Follow the domain: check all oral/highlight papers at the top conferences (NeurIPS, ICML, ICLR, AISTATS, etc.) to see where the field is moving. Follow what your competitors are publishing. Use Google Scholar to follow seminal papers and their forward citations.
Attend summer schools. During your PhD it is useful to attend one or two different schools, around halfway through. Great ones are MLSS.cc, DeepBayes.ru, SMILES, DeepLearn, DLRL.ca, and ProbAI.
Love what you do. Move towards projects that interest you. This is your PhD thesis and career; you need to drive it forward. Yet commit to and finish your projects, regardless of whether they still interest you.
If you feel stuck: slow down, rethink what you are doing, and discuss with your colleagues: what problem are you solving, and is it the right problem?
Organize your time. Make sure to spend ~20% of your time reading and another ~20% writing.
Study math. You want to understand linear algebra and vector calculus; probability, statistics, and Bayesian inference; and differential and integral calculus. You will also benefit from measure theory, functional analysis, differential geometry, and complex analysis.
Read textbooks. Courses, Wikipedia, or blogs do not give deep understanding; you want to have read at least a few serious textbooks cover to cover. Start with Deisenroth’s book and one probabilistic ML book, then continue to others. Great books are:
On probabilistic learning: Murphy’s “Probabilistic Machine Learning”, Barber’s “Bayesian Reasoning and Machine Learning”, Bishop’s “Pattern Recognition and Machine Learning”
On learning theory: Mohri’s “Foundations of Machine Learning”, Shalev-Shwartz’s “Understanding Machine Learning”
On information theory: MacKay’s “Information Theory, Inference, and Learning Algorithms”
On statistical learning: Hastie, Tibshirani, and Friedman’s “The Elements of Statistical Learning”
On Bayesian modelling: Gelman and Vehtari’s “Bayesian Data Analysis”
On deep learning: Goodfellow’s “Deep Learning”
There are many other research guides worth reading.