The organization of slides should be hierarchical. By reading only the titles of the slides, the reader should be able to capture the whole story.
Use as many pictures, animations, and videos as possible. These are called hooks, and they are more enticing than text and formulas.
Do not stuff too many words, or even worse, too many formulas on each slide. One point per slide and one slide per point.
Unify the font, color, etc. Use at most three different colors (e.g., pure blue, pure red, pure black), one font type (e.g., Times New Roman (good for PDF) or Arial), and three different font sizes (e.g., 32, 24, 18) to distinguish title, subtitle, and body text.
For non-professional readers, use as little jargon as possible. Make sure each jargon term is well defined before it is used. Try to replace jargon with plain words.
Hyper-Parameter Tuning
Parameter searching:
- Grid search (a minimal sketch is given after this list).
- Bayesian optimization: SigOpt provides a parameter-tuning service based on Bayesian optimization; this is a good place to start.
- Spectral optimization: Harmonica [github] currently only supports tuning binary parameters.
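As a minimal sketch of grid search (the `evaluate` function and the hyper-parameter grid below are placeholders, not part of these notes), one can enumerate all combinations and keep the best one:

```python
from itertools import product

def evaluate(lr, weight_decay):
    # Placeholder objective: replace with your own training + validation loop
    # that returns a validation score for the given hyper-parameters.
    return -(abs(lr - 1e-3) + abs(weight_decay - 1e-5))

grid = {"lr": [1e-4, 1e-3, 1e-2], "weight_decay": [0.0, 1e-5, 1e-4]}

best_score, best_config = float("-inf"), None
for lr, wd in product(grid["lr"], grid["weight_decay"]):
    score = evaluate(lr, wd)
    if score > best_score:
        best_score, best_config = score, {"lr": lr, "weight_decay": wd}

print("best config:", best_config, "validation score:", best_score)
```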
Model ensembling:
- different initialization
- different epochs
- different hyperparameters
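A common way to combine such an ensemble at test time is to average the members' predicted probabilities and take the argmax. A minimal sketch (the dummy predictions below are only for illustration):

```python
import numpy as np

# Dummy softmax outputs (8 samples x 5 classes) from 3 models trained with
# different seeds, epochs, or hyper-parameters; replace with real predictions.
probs_list = [np.random.dirichlet(np.ones(5), size=8) for _ in range(3)]

ensemble_probs = np.mean(probs_list, axis=0)    # average the member predictions
ensemble_pred = ensemble_probs.argmax(axis=1)   # final class decision per sample
print(ensemble_pred)
```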
Resources
A toolkit wrapping all types of parameter-tuning methods: Google Vizier
How to Conduct Experiments
Experimental setting:
- Select more than 3 popular benchmark datasets. Most papers perform evaluation on 3~5 datasets.
- Determine the training/testing split. By default, strictly follow the setting in recent papers. If their settings are not applicable, we can determine our own setting reasonably and fully describe the details in the paper.
Experimental implementation:
- For commonly used models, avoid unnecessary reimplementation. Just use the released code and models.
- If closely related papers release their code, modify based on their code and avoid starting from scratch.
- Document your code with readable and meaningful comments. Otherwise, even you cannot understand your own code after a long time.
- Manage different versions of your code carefully, otherwise your code will get messy very rapidly. Use a version-control tool such as Git/GitHub to create branches, revert to old versions, etc. The brute-force approach is to create an individual folder for each version, which is not recommended.
Experimental running:
- Avoid redundant or unnecessary experiments.
- Run experiments in parallel using all available computing resources.
- Tune hyper-parameters wisely.
- Save final output and important intermediate output for future analysis.
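A minimal sketch of saving per-run outputs (the directory layout, file names, and dummy values below are just one possible convention, not prescribed here):

```python
import json, os, pickle, time

# One folder per run, named by timestamp, so earlier results are never overwritten.
run_dir = os.path.join("runs", time.strftime("%Y%m%d_%H%M%S"))
os.makedirs(run_dir, exist_ok=True)

config = {"lr": 1e-3, "batch_size": 32}     # hyper-parameters of this run
metrics = {"epoch": 10, "val_acc": 0.87}    # important intermediate output
predictions = [0, 1, 1, 0]                  # final output on the test set

with open(os.path.join(run_dir, "config.json"), "w") as f:
    json.dump(config, f, indent=2)
with open(os.path.join(run_dir, "metrics.json"), "w") as f:
    json.dump(metrics, f, indent=2)
with open(os.path.join(run_dir, "predictions.pkl"), "wb") as f:
    pickle.dump(predictions, f)
```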
Experimental record:
- Make a to-do list for what experiments you are going to do.
- For the experiments you have done, summarize your observation and conclusion. Then, adjust the remaining to-do list accordingly.
How to Come up with Ideas
Non-trivial ideas require deep understanding and thinking, solid math background, excellent engineering skills, and sometimes good luck. Trivial ideas are much easier with the aid of some short formulas. We do not encourage students to generate trivial ideas based on short formulas, but this can help beginners take the first step and gain confidence.
naive combination: e.g., combine regularizers, combine network modules
kernelize: project from a low-dimensional space to a high-dimensional or even infinite-dimensional space, which was very popular before the deep learning era.
from single to multiple: create multiple-XXX learning setting, e.g., multi-view learning, multi-instance learning, multi-label learning, multi-task learning.
separate common and specific components: in a multi-XXX learning setting, we can separate XXX-invariant components from XXX-specific components, e.g., for multiple domains (resp., categories), we can separate domain (resp., category)-invariant components from domain (resp., category)-specific components.
combine local information with global information: jointly use local and global information from images/videos.
from discrete to continuous (e.g., using integrals) or from continuous to discrete (e.g., using bases).
from coarse-grained to fine-grained: generate a coarse-grained result first and then refine it into a fine-grained result.
from simple structure to advanced structure: e.g., from sequence network to tree network to graph network, from vector representation to matrix representation to tensor representation.
introduce auxiliary/side information to help the original task: the auxiliary/side information is usually more accessible.
learn different weights on different components: the essence of machine learning is learning weights, e.g., assign different weights to different feature dimensions (linear classifiers), different kernels (multiple-kernel learning), or different training samples (sample reweighting; a minimal sketch is given after this list).
fill in the table: analyze the existing methods and summarize them in one table. Then, find the holes in the table and fill them with your own method.
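To make the sample-reweighting case above concrete, here is a minimal sketch (the per-sample losses and weights are arbitrary placeholders, not from these notes); down-weighted samples simply contribute less to the training loss:

```python
import numpy as np

# Placeholder per-sample losses and weights; in practice the weights may be
# learned or set heuristically (e.g., down-weight suspected noisy samples).
losses = np.array([0.2, 1.5, 0.7, 3.0])
weights = np.array([1.0, 0.5, 1.0, 0.1])

weighted_loss = np.sum(weights * losses) / np.sum(weights)
print("reweighted training loss:", weighted_loss)
```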
My previous research works can be organized into the following groups:
- data-centric: how to obtain training data
- input format: e.g., color space, spatial/frequency domain
- output format: redefine the output format
- basic model: upgrade CNN to transformer or diffusion model
- essential math: extract and solve the essential math problem
Fair comparison
Carefully compare your experimental settings with those described in previous papers. The experimental settings include the training/testing split, input format (e.g., image size), evaluation metric, backbone network, etc. If they are not exactly the same, you need to re-run their methods with exactly the same experimental settings as yours for a fair comparison. For example, if the baseline uses a 32x32 image size while you use 224x224, that is an unfair comparison. If the baseline uses AlexNet as the backbone while you use ResNet, that is an unfair comparison.
The baseline methods may use more or less information compared with your method. Depending on whether the baseline uses more or less information and whether its performance is better or worse than yours, there are four cases:
- If the baseline uses more information but achieves worse results, never mind. Just brag about your own method.
- If the baseline uses less information but achieves better results, that is a serious problem. You have to check your method.
- If the baseline uses less information and achieves worse results, that is reasonable. But you may be accused of unfair comparison because your method uses more information. For safety, please augment the baseline with the extra information and compare again.
- If the baseline uses more information and achieves better results, that is also reasonable. But do not casually put these results in the paper, because dumb reviewers may ignore the details and point out that your method is not good enough. That is very risky! Please remove the extra information from the baseline and compare again.
Entering a New Research Field
See the large picture
- Search the keyword + ‘survey/tutorial’ in Google/Google Scholar. For example, if you want to work on zero-shot learning, search ‘zero-shot learning’ + ‘survey/tutorial’.
- Find recent surveys/tutorials published at top conferences/journals or their associated workshops. You need to know the list of top conferences/journals in the related field (e.g., computer vision, machine learning).
Know the state-of-the-art
- Search the keyword in the most recent top conferences (e.g., CVPR 2018). Journals generally lag behind conferences, so focus on conference papers. After collecting the paper list, you will have a sense of whether this topic is popular or crowded (e.g., only 1 paper or more than 20 papers at one conference).
Summarize existing methods
- You need to collect a paper list. Initiate the paper list following the previous step (knowing the state-of-the-art). Read the related work sections of these papers and add the cited papers to the list. Repeat this procedure to enrich the paper list. Try to categorize the methods used in these papers according to a certain taxonomy.
Ensure the Novelty of Your Idea
There is no automatic system that can tell you whether your idea has been done or the similarity between your idea and the most similar previous work.
Search all possible keywords (e.g., synonyms) in Google/Google Scholar.
If there exist old and recent surveys, carefully read them. Otherwise, make a survey on your own.
Quickly go through previous works to ensure that your idea has not been done before. You do not need to understand the details of their methods. Actually, telling whether their methods are close to yours is much simpler than fully understanding them. For instance, you can just pay attention to the most obvious parts (e.g., flowchart, objective function, and abstract) of their papers.
Through the above steps, identify the most related work, think about the difference and your advantage. If you cannot clearly state the difference and advantage of your idea, do not work on it.
Dictionary for Paper Writing
Between Logical Blocks
Cause and effect: because, since, as, so, consequently, therefore, thus, hence
Contrast: but, however, nevertheless, nonetheless, despite, whereas, although, while, albeit
Result: generate, produce, yield, lead to, result in, give rise to
Addition: besides, furthermore, moreover
Sequence: following, followed by, prior to, in the wake of
Opposite: on the contrary, in contrast with, in opposition to, as opposed to, vice versa, the other way around
In other words: in other words, that is, that being said
Specifically: particularly, in particular, specifically, to be exact, in detail, concretely
Within Logical Blocks
Advantage: advantage, benefit
Disadvantage: disadvantage, drawback, shortage, shortcoming, deficiency
Outperform: outperform, exceed, surpass
Based on: based on, according to, on the basis of, in the light of, on the premise of
Besides: besides, apart from, aside from
Facilitate: facilitate, fuel, advance, benefit, aid in, assist in, spur
Impair: degrade, impair, hinder, eliminate, compromise, harm, hamper
Solve: address, solve, handle, mitigate, tackle, cope with, overcome, circumvent, bypass
Alleviate: alleviate, suppress, mitigate, assuage, relieve, ameliorate
Introduce: introduce, describe, discuss, elaborate, review
Show: demonstrate, show, indicate, exhibit, display, illustrate
Verify: prove, justify, verify
Doubt: cast doubt on
Use: use, utilize, employ, leverage, harness
Replace: instead of, in lieu of, alternative, surrogate, supersede, replace
Similar: similar to, analogous to, in analogy to, resemble, bear resemblance to, be reminiscent of, akin to
Contain: contain, include, be composed of, be comprised of, consist of, be formed by
Up to now: up to now, so far
Important: important, significant, crucial, vital, critical
Notable: notable, evident, pronounced
A large amount of: considerable, massive, vast, myriads of, a variety of, a wide range of, substantial, adequate, plenty of, a plethora of, unprecedented
To a large extent: dramatically, greatly, significantly, considerably
Necessary: necessary, imperative, demanding, in high demand
Good: superior, favorable, compelling, competitive, remarkable, excellent, impressive
Harmful: harmful, detrimental
Popular: popular, prevailing, prevalent, attractive, enticing, inviting, dominant
Comparable: comparable, on par with
Robust: robust, agnostic, insensitive, insusceptible
Sensitive: sensitive, susceptible, brittle
Suitable: well-suited, suitable, applicable, well-tailored
Difficult: tough, challenging, formidable
Cumbersome: cumbersome, unwieldy
Determine baselines
Generally speaking, for the baselines, you need to compare with others (i.e., the state-of-the-art methods) and compare with yourself (i.e., component analysis or ablation study).
Compare with others
How to select baselines?
- Select the baselines from top conferences. You can refer to related papers published at recent top conferences and find out which baselines they compare with. The intersection of their baselines should be the most popular ones.
- The selected baselines should be discussed in the related work section of your paper.
- The selected baselines should include at least several from the most recent top conferences.
- The selected baselines should cover the works of researchers who are very famous in this field or have many publications in this field. If you do not cite his/her paper and your paper unfortunately goes to him/her for review, then you are doomed.
Could we directly copy the results from previous papers?
- Carefully compare your experimental settings with those described in previous papers. The experimental settings include the training/testing split, input format (e.g., image size), evaluation metric, backbone network, etc. If they are exactly the same, just copy the results.
- Otherwise, you need to re-run their methods with exactly the same experimental settings as yours for fair comparison.
Need we implement the baselines?
- Search the code online and contact the authors for code.
- If you cannot get the code, you need to implement the baseline by yourself according to the details provided in the paper. In practice, it is nearly impossible to exactly re-implement the baseline unless the method is frustratingly easy (e.g., 10 lines of MATLAB code), so just follow your understanding and implement a reasonable version.
Compare with yourself
Why is it necessary?
- Because you need to understand which component of your method really works.
- If you do not compare with yourself, you provide a perfect reason for reviewers to reject your paper.
How many special cases do we need?
- That mainly depends on the technical contributions of your paper. If you claim that regularizer XXX, strategy XXX, or subnetwork XXX is proposed by yourself and is very effective, you have to verify that in the experiments.
- For some naive special cases, you may just need to set a certain hyper-parameter to 0 or freeze some components in your network, so the experiments will be quite simple (see the sketch below). For other advanced special cases, that will take some more work.
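As a minimal sketch of such a special case (the model and module names below are hypothetical, not from these notes), one can switch off a proposed branch or freeze its parameters for an ablation run:

```python
import torch.nn as nn

# Hypothetical model with an auxiliary branch whose contribution we want to ablate.
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(128, 64)
        self.aux_branch = nn.Linear(128, 64)  # the proposed sub-network under study
        self.head = nn.Linear(64, 10)

    def forward(self, x, use_aux=True):
        feat = self.backbone(x)
        if use_aux:                           # special case: drop the branch entirely
            feat = feat + self.aux_branch(x)
        return self.head(feat)

model = Model()
# Alternative ablation: keep the branch but freeze it so that it is not trained.
for p in model.aux_branch.parameters():
    p.requires_grad = False
```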
Checklist
Please select and compare with baselines meticulously. You can summarize and check your baseline information in a table such as the one below.
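One possible layout (the columns and the example row are only illustrative, not taken from these notes):

| Baseline | Venue/Year | Code available? | Same experimental setting as ours? | Results copied or re-run? |
|---|---|---|---|---|
| Method A (hypothetical) | CVPR 2018 | yes | yes | copied |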
Conference Rebuttal
Determine whether a rebuttal is necessary. If the scores are too low, just give up.
Summarize the questions from the reviewers and rank them by importance.
Pick out the questions which call for experiments and conduct those experiments immediately.
While the experiments are running, draft the response file. Pay attention to the format of the response file (e.g., character limit or one-page PDF, URLs allowed or not).
In the response file, try to cover all the questions if possible. Otherwise, ignore the questions of least importance.
The tone of the response file must not be rude or offensive. Be cool and confident.
If the maximum number of characters is quite limited, there are many tricks to save space.