Mutual Information

Posted on 2022-06-16 | In paper note
  1. [1]: uses the KL divergence to build a variational upper bound on mutual information (MI), which can then be minimized; the variational marginal r(z) can be set to a unit Gaussian for simplicity.
  2. MINE [2]: a lower bound on MI based on the Donsker-Varadhan representation of the KL divergence. Since the estimator is strongly consistent, MINE can serve as a tight estimate of MI. Minimal sketches of both bounds follow this list.
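
Below is a minimal PyTorch sketch of the two bounds; the function names, the statistics network T, and the Gaussian-posterior parameterization (mu, logvar) are illustrative assumptions, not the papers' code.

    import torch

    def kl_to_unit_gaussian(mu, logvar):
        # Closed-form KL(q(z|x) || r(z)) with r(z) = N(0, I): the variational
        # upper-bound term of [1], minimized to reduce MI.
        return 0.5 * torch.sum(mu.pow(2) + logvar.exp() - logvar - 1, dim=-1)

    def mine_lower_bound(T, x, z):
        # Donsker-Varadhan lower bound of [2]; T is a trainable statistics
        # network mapping (x, z) pairs to scalar scores.
        joint = T(x, z).mean()
        z_marg = z[torch.randperm(z.size(0))]  # shuffle to get marginal samples
        n = torch.tensor(float(z.size(0)))
        return joint - (torch.logsumexp(T(x, z_marg), dim=0) - torch.log(n))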

References

  1. Alemi, Alexander A., et al. “Deep variational information bottleneck.” arXiv preprint arXiv:1612.00410 (2016).

  2. Belghazi, Mohamed Ishmael, et al. “Mutual information neural estimation.” ICML, 2018.

Meta Learning

Posted on 2022-06-16 | In paper note

Taxonomy

1) metric-based: learn a good metric

  • matching network [1]
  • relation network [2]
  • prototypical network [3] [4] (see the sketch after this list)
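
As an example of the metric-based idea, here is a minimal prototypical-network [3] classification step; tensor shapes and names are illustrative assumptions.

    import torch

    def prototype_logits(support_emb, support_labels, query_emb, n_classes):
        # Each class prototype is the mean embedding of its support examples.
        protos = torch.stack([support_emb[support_labels == c].mean(dim=0)
                              for c in range(n_classes)])          # (C, D)
        # Queries are scored by negative squared Euclidean distance.
        return -torch.cdist(query_emb, protos).pow(2)              # (Q, C)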

2) optimization-based: gradient-based adaptation

  • Meta-Learner LSTM [5]
  • MAML [6] [7] [8]
  • REPTILE (an approximation of MAML) [9]

    Optimization-based methods aim to obtain a good parameter initialization. If we simply train on multiple tasks jointly, the resulting parameters may be sub-optimal for each individual task (see the sketch below).
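
A minimal first-order MAML [6] sketch, treating the model as a function of an explicit parameter list; tasks, forward, and loss_fn are assumed to be supplied by the surrounding training code.

    import torch

    def fomaml_step(params, tasks, forward, loss_fn, inner_lr=0.01, outer_lr=0.001):
        meta_grads = [torch.zeros_like(p) for p in params]
        for (x_s, y_s), (x_q, y_q) in tasks:
            # Inner loop: one gradient step on the support set from the shared init.
            grads = torch.autograd.grad(loss_fn(forward(params, x_s), y_s), params)
            fast = [p - inner_lr * g for p, g in zip(params, grads)]
            # Outer loss: evaluate the adapted parameters on the query set.
            grads_q = torch.autograd.grad(loss_fn(forward(fast, x_q), y_q), fast)
            for mg, g in zip(meta_grads, grads_q):
                mg += g / len(tasks)  # first-order: drop second-derivative terms
        # Move the shared initialization toward parameters that adapt well.
        return [(p - outer_lr * mg).detach().requires_grad_()
                for p, mg in zip(params, meta_grads)]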

3) model-based: predict model parameters

  • MANN [10]
  • MetaNet [11]

Reference:

  1. Vinyals, Oriol, et al. “Matching networks for one shot learning.” NIPS, 2016.
  2. Sung, Flood, et al. “Learning to compare: Relation network for few-shot learning.” CVPR, 2018.
  3. Snell, Jake, Kevin Swersky, and Richard Zemel. “Prototypical networks for few-shot learning.” NIPS, 2017.
  4. Ren, Mengye, et al. “Meta-learning for semi-supervised few-shot classification.” arXiv preprint arXiv:1803.00676 (2018).
  5. Ravi, Sachin, and Hugo Larochelle. “Optimization as a model for few-shot learning.” ICLR, 2017.
  6. Chelsea Finn, Pieter Abbeel, and Sergey Levine. “Model-agnostic meta-learning for fast adaptation of deep networks.” ICML, 2017.
  7. Finn, Chelsea, and Sergey Levine. “Meta-learning and universality: Deep representations and gradient descent can approximate any learning algorithm.” arXiv preprint arXiv:1710.11622 (2017).
  8. Grant, Erin, et al. “Recasting gradient-based meta-learning as hierarchical bayes.” arXiv preprint arXiv:1801.08930 (2018).
  9. Nichol, Alex, Joshua Achiam, and John Schulman. “On first-order meta-learning algorithms.” arXiv preprint arXiv:1803.02999 (2018).
  10. Santoro, Adam, et al. “Meta-learning with memory-augmented neural networks.” ICML, 2016.
  11. Munkhdalai, Tsendsuren, and Hong Yu. “Meta networks.” ICML, 2017.

Tutorials:

  • https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html

  • http://metalearning-symposium.ml/files/vinyals.pdf

  • https://github.com/floodsung/Meta-Learning-Papers

  • A more comprehensive survey: https://arxiv.org/pdf/1810.03548.pdf

Lifelong Learning

Posted on 2022-06-16 | In paper note

Related concepts: online learning, incremental learning, continual learning

Survey

  • Continual lifelong learning with neural networks: A review
  • A continual learning survey: Defying forgetting in classification tasks
  • Class-incremental learning: survey and performance evaluation

Resources

  • Stanford course

Latent Variable Regularization

Posted on 2022-06-16 | In paper note

Regularizing latent variables or latent features can improve the generalizability of a classifier and lower its error bound.

Regularizing latent variables essentially decreases their entropy. There are some common tricks to decrease the entropy of latent variables, for example:

  1. dropout
  2. weight decay
  3. adding random noise to the latent variables, as in VAE and GAN (see the sketch below)
  4. adding random perturbations to model parameters

For a theoretical proof, please refer here.
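
A minimal sketch of trick 3, injecting Gaussian noise into latent features during training; the noise scale and names are illustrative assumptions.

    import torch

    def noisy_latent(h, sigma=0.1, training=True):
        # Additive noise limits the information the downstream classifier
        # can extract from h, regularizing the latent representation.
        if training:
            return h + sigma * torch.randn_like(h)
        return h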

Vote Aggregation

Posted on 2022-06-16 | In paper note
  1. label smoothing [1]: interpolate the ground-truth label with a uniform label (sketch below)

  2. bootstrapping [2]: interpolate the noisy label with the label predicted in the previous iteration (sketch below)

  3. noisy data + clean data [3]: interpolate the noisy label with a distilled label
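
Minimal sketches of the first two schemes; the interpolation weights are illustrative assumptions.

    import torch

    def label_smoothing(y_onehot, alpha=0.1):
        # [1]: mix the one-hot ground truth with the uniform distribution.
        n_classes = y_onehot.size(-1)
        return (1 - alpha) * y_onehot + alpha / n_classes

    def bootstrap_target(y_noisy, p_prev, beta=0.8):
        # [2]: mix the (possibly noisy) label with the model's own
        # prediction from the previous iteration.
        return beta * y_noisy + (1 - beta) * p_prev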

[1] Szegedy, Christian, et al. “Rethinking the inception architecture for computer vision.” CVPR, 2016.

[2] Reed, Scott, et al. “Training deep neural networks on noisy labels with bootstrapping.” arXiv preprint arXiv:1412.6596 (2014).

[3] Li, Yuncheng, et al. “Learning from noisy labels with distillation.” ICCV, 2017.

Knowledge Graph

Posted on 2022-06-16 | In paper note

Definition: entities, attributes, and relationships

Two ways to construct a knowledge graph:

  1. probabilistic models (graphical model/random walk)

  2. embedding-based models (an illustrative scorer follows this list)
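
As an illustration of the embedding-based family (the post names no specific model; TransE-style scoring is one common choice):

    import torch

    def transe_score(e_h, e_r, e_t):
        # TransE-style scoring: a triple (head, relation, tail) scores high
        # when the head embedding translated by the relation embedding
        # lands near the tail embedding.
        return -torch.norm(e_h + e_r - e_t, p=1, dim=-1)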

Incremental SVM

Posted on 2022-06-16 | In paper note

1) Approximate incremental SVM: pass through the dataset many times

  • Pegasos: select a training batch in each iteration (a step sketch follows this section)

    • python: https://github.com/ejlb/pegasos and https://github.com/avaitla/Pegasos
    • C: https://www.cs.huji.ac.il/~shais/code/index.html
    • matlab: https://www.mathworks.com/matlabcentral/fileexchange/31401-pegasos-primal-estimated-sub-gradient-solver-for-svm?focused=5188208&tab=function
  • sklearn.linear_model: SGDClassifier supports incremental updates via partial_fit

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    clf = SGDClassifier(learning_rate='constant', eta0=0.1, shuffle=False)
    # get x1, y1 as a new batch; the first partial_fit call must declare
    # the full set of classes (here assumed binary) that may appear later
    clf.partial_fit(x1, y1, classes=np.array([0, 1]))
    # get x2, y2
    # update accuracy if needed, then keep updating incrementally
    clf.partial_fit(x2, y2)
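
For reference, a minimal NumPy sketch of one Pegasos iteration (mini-batch sub-gradient step with the optional projection); variable names are illustrative.

    import numpy as np

    def pegasos_step(w, X, y, lam, t):
        # X: mini-batch of features, y: labels in {-1, +1}, t: iteration index.
        eta = 1.0 / (lam * t)                    # step size 1/(lambda*t)
        mask = y * (X @ w) < 1                   # margin violators in the batch
        grad = lam * w
        if mask.any():
            grad = grad - (y[mask, None] * X[mask]).sum(axis=0) / len(y)
        w = w - eta * grad
        # Optional projection onto the ball of radius 1/sqrt(lambda).
        norm = np.linalg.norm(w)
        if norm > 1.0 / np.sqrt(lam):
            w = w / (norm * np.sqrt(lam))
        return w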

2) Exact incremental or decremental SVM: only pass through the dataset once

  • Incremental and Decremental Support Vector Machine Learning
    http://www.isn.ucsd.edu/svm/incremental

  • SVM Incremental Learning, Adaptation and Optimization: extends the work above
    matlab: https://github.com/diehl/Incremental-SVM-Learning-in-MATLAB

  • Incremental and decremental training for linear classification: an extension of liblinear focusing on the linear case
    http://www.csie.ntu.edu.tw/~cjlin/papers/ws/index.html

Implicit Modelling

Posted on 2022-06-16 | In paper note
  1. Simulate an infinite-depth network by the fixed-point iteration $h^{(t+1)}=f_{\theta}(h^{(t)};x)$, where $x$ is the input and $\theta$ parameterizes a single weight-tied transformation. As the number of iterations grows, the hidden state converges to the fixed point $h^{\star}=f_{\theta}(h^{\star};x)$. Examples: DEQ [1], MDEQ [2], iFPN [3] (a naive solver sketch follows below).
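
A minimal sketch of solving for the fixed point by naive iteration; in practice DEQ [1] uses faster root-finding (e.g., Broyden's method) and differentiates implicitly through the equilibrium. The names below are illustrative.

    import torch

    def fixed_point(f, x, h0, max_iter=50, tol=1e-4):
        # Iterate h <- f(h, x) until the update is smaller than tol.
        h = h0
        for _ in range(max_iter):
            h_next = f(h, x)
            if torch.norm(h_next - h) < tol:
                return h_next
            h = h_next
        return h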

Reference:

  1. Bai, Shaojie, J. Zico Kolter, and Vladlen Koltun. “Deep equilibrium models.” Advances in Neural Information Processing Systems. 2019.
  2. Bai, Shaojie, Vladlen Koltun, and J. Zico Kolter. “Multiscale deep equilibrium models.” arXiv preprint arXiv:2006.08656 (2020).
  3. Wang, Tiancai, Xiangyu Zhang, and Jian Sun. “Implicit Feature Pyramid Network for Object Detection.” arXiv preprint arXiv:2012.13563 (2020).

Image Matching

Posted on 2022-06-16 | In paper note

Survey

  • Image Matching from Handcrafted to Deep Features: A Survey

Deep learning methods

  • Correlation tensor: [1] [2] [3]
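
A sketch of the 4D correlation tensor used in these works: every spatial location of one feature map is matched against every location of the other. Shapes and names are illustrative assumptions.

    import torch

    def correlation_tensor(f_a, f_b):
        # f_a, f_b: (B, C, H, W) feature maps, typically L2-normalized over C.
        B, C, H, W = f_a.shape
        a = f_a.view(B, C, H * W)
        b = f_b.view(B, C, H * W)
        corr = torch.einsum('bci,bcj->bij', a, b)   # (B, HW, HW) dot products
        return corr.view(B, H, W, H, W)             # 4D correlation per batch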

Reference

  1. Rocco, Ignacio, et al. “Neighbourhood consensus networks.” arXiv preprint arXiv:1810.10510 (2018).
  2. Rocco, Ignacio, Relja Arandjelovic, and Josef Sivic. “Convolutional neural network architecture for geometric matching.” CVPR, 2017.
  3. Chen, Jianchun, et al. “Arbicon-net: Arbitrary continuous geometric transformation networks for image registration.” NIPS, 2019.

Image and Video Proposals

Posted on 2022-06-16 | In paper note

Image proposals:

  • Selective search [1]: hierarchical grouping based on different similarity metrics [code] (a usage sketch follows this list)

  • Salient object detection [2]: identify segments that are easy to compose from their own parts but hard to compose from the remaining parts of the image.

  • EdgeBox [3]: score boxes on the observation that a box tightly enclosing a set of edge contours is likely to contain an object.

  • ACF detector [4]: compute gradient histograms on image pyramids

  • Region Proposal Network (RPN) from Faster R-CNN [5]
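
A minimal usage sketch of selective search via OpenCV's contrib module (requires opencv-contrib-python); the image path is a placeholder.

    import cv2

    img = cv2.imread('image.jpg')
    ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
    ss.setBaseImage(img)
    ss.switchToSelectiveSearchFast()   # faster preset, slightly lower recall
    boxes = ss.process()               # array of (x, y, w, h) proposals
    print(len(boxes), 'proposals')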

Video proposals:

  • Video edgebox [1]: an extension of EdgeBox

  • R-C3D [2]: an extension of RPN to the temporal domain

[1] Zhu, Wangjiang, et al. “A key volume mining deep framework for action recognition.” CVPR. 2016.

[2] Xu, Huijuan, Abir Das, and Kate Saenko. “R-c3d: Region convolutional 3d network for temporal activity detection.” ICCV, 2017.
