Knowledge Graph
Definition: entities, attributes, and relationships
Two ways to construct a knowledge graph:
probabilistic models (graphical model/random walk)
embedding based models
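To make the definition concrete, here is a minimal sketch of a knowledge graph stored as (head, relation, tail) triples plus per-entity attributes. The entity names, the `neighbors` helper, and the choice of a TransE-style score as the embedding-based example are all illustrative assumptions, not something taken from a specific library.

```python
import math

# Hypothetical toy knowledge graph: relationships as (head, relation, tail) triples.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
]

# Attributes attached to each entity (illustrative values only).
attributes = {
    "Paris": {"type": "city"},
    "France": {"type": "country"},
}


def neighbors(entity):
    """Return the (relation, tail) pairs of all triples whose head is `entity`."""
    return [(r, t) for h, r, t in triples if h == entity]


def transe_score(h_vec, r_vec, t_vec):
    """TransE-style plausibility: negative L2 distance between h + r and t.

    A score closer to 0 means the triple is considered more plausible.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(h_vec, r_vec, t_vec)))
```

For example, `neighbors("Paris")` lists the outgoing relationships of one entity, and `transe_score` would be applied to learned embedding vectors to rank candidate triples.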
1) Approximate incremental SVM: pass through the dataset many times
Pegasos: select a training batch in each iteration
https://github.com/avaitla/Pegasos
sklearn.linear_model: SGD
import sklearn.linear_model
clf = sklearn.linear_model.SGDClassifier(learning_rate='constant', eta0=0.1, shuffle=False, max_iter=1)  # max_iter replaces the removed n_iter argument
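The mini-batch idea behind Pegasos can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions (no bias term, no optional projection step, labels in {-1, +1}); the function name and default parameters are hypothetical, and it is not the code from the repository linked above.

```python
import random


def pegasos(X, y, lam=0.01, n_iters=1000, batch_size=1, seed=0):
    """Simplified Pegasos for a linear SVM without bias.

    X: list of feature lists, y: list of labels in {-1, +1}.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for t in range(1, n_iters + 1):
        eta = 1.0 / (lam * t)  # step size decays as 1 / (lambda * t)
        batch = [rng.randrange(n) for _ in range(batch_size)]
        # Only samples that violate the margin contribute to the sub-gradient.
        grad = [0.0] * d
        for i in batch:
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            if margin < 1:
                for j in range(d):
                    grad[j] += y[i] * X[i][j]
        # Shrink w (regularization) and move toward the violating samples.
        w = [(1 - eta * lam) * wj + (eta / batch_size) * gj
             for wj, gj in zip(w, grad)]
    return w
```

Each iteration touches only one small batch, which is why the method makes many cheap passes over the data instead of solving the full problem at once.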
2) Exact incremental or decremental SVM: only pass through the dataset once
Incremental and Decremental Support Vector Machine Learning
http://www.isn.ucsd.edu/svm/incremental
SVM Incremental Learning, Adaptation and Optimization: extend the work above
matlab: https://github.com/diehl/Incremental-SVM-Learning-in-MATLAB
Incremental and decremental training for linear classification: an extension of LIBLINEAR focusing on linear problems
http://www.csie.ntu.edu.tw/~cjlin/papers/ws/index.html
Reference:
Selective search [1]: hierarchical grouping based on different similarity metrics [code]
Salient object detection [2]: identify the segment that is easy to compose from its own parts but hard to compose from the remaining parts of the image.
EdgeBox [3]: boxes that tightly enclose a set of edges are likely to contain an object.
ACF detector [4]: compute gradient histograms on image pyramids
Region Proposal Network (RPN) from faster-RCNN [5]
[1] Zhu, Wangjiang, et al. “A key volume mining deep framework for action recognition.” CVPR. 2016.
[2] Xu, Huijuan, Abir Das, and Kate Saenko. “R-c3d: Region convolutional 3d network for temporal activity detection.” ICCV, 2017.
A survey on graph similarity [4]
pooling/unpooling [5]
GNN for zero-shot learning [1][2]: treat each category as a graph node
GNN for multi-view learning [3]: treat each view as a graph node
GNN for clustering [10]
[1] Wang, Xiaolong, Yufei Ye, and Abhinav Gupta. “Zero-shot recognition via semantic embeddings and knowledge graphs.” CVPR, 2018.
[2] Lee, Chung-Wei, et al. “Multi-label zero-shot learning with structured knowledge graphs.” CVPR, 2018.
[3] Wang, Dongang, et al. “Dividing and aggregating network for multi-view action recognition.” ECCV, 2018.
[4] Ma, Guixiang, et al. “Deep Graph Similarity Learning: A Survey.” arXiv preprint arXiv:1912.11615 (2019).
[5] Gao, Hongyang, and Shuiwang Ji. “Graph U-Nets.” arXiv preprint arXiv:1905.05178 (2019).
[6] Kipf, Thomas N., and Max Welling. “Semi-supervised classification with graph convolutional networks.” arXiv preprint arXiv:1609.02907 (2016).
[7] Veličković, Petar, et al. “Graph attention networks.” arXiv preprint arXiv:1710.10903 (2017).
[8] Hamilton, Will, Zhitao Ying, and Jure Leskovec. “Inductive representation learning on large graphs.” NeurIPS, 2017.
[9] Vaswani, Ashish, et al. “Attention is all you need.” NeurIPS, 2017.
[10] Bo, Deyu, et al. “Structural Deep Clustering Network.” Proceedings of The Web Conference 2020. 2020.
[11] Veličković, Petar, et al. “Pointer Graph Networks.” arXiv preprint arXiv:2006.06380 (2020).
[12] Chen, Ming, et al. “Simple and deep graph convolutional networks.” arXiv preprint arXiv:2007.02133 (2020).
[13] Klicpera, Johannes, Aleksandar Bojchevski, and Stephan Günnemann. “Predict then propagate: Graph neural networks meet personalized pagerank.” arXiv preprint arXiv:1810.05997 (2018).
[14] Dosovitskiy, Alexey, et al. “An image is worth 16x16 words: Transformers for image recognition at scale.” arXiv preprint arXiv:2010.11929 (2020).
[15] Carion, Nicolas, et al. “End-to-End Object Detection with Transformers.” arXiv preprint arXiv:2005.12872 (2020).
[16] Chen, Hanting, et al. “Pre-Trained Image Processing Transformer.” arXiv preprint arXiv:2012.00364 (2020).
[17] Chefer, Hila, Shir Gur, and Lior Wolf. “Transformer Interpretability Beyond Attention Visualization.” arXiv preprint arXiv:2012.09838 (2020).
An empirical study on evaluation metrics of generative adversarial networks [1] with code.
Inception Score (IS): classification score using the InceptionNet pretrained on ImageNet
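$\mathrm{IS} = \exp\left(\mathbb{E}_{x}\left[\mathrm{KL}\left(p_M(y|x)\,\|\,p_M(y)\right)\right]\right)$, in which $p_M(y|x)$ is the label distribution predicted by the pretrained model $M$ for a generated sample $x$ and $p_M(y)$ is the marginal distribution of $p_M(y|x)$ over generated samples. Expect $p_M(y|x)$ to be of low entropy (each sample is classified confidently) while $p_M(y)$ to be of high entropy (the samples cover many classes). The higher, the better.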
Mode score: extension of Inception score
Kernel MMD: MMD distance between two data distributions
Wasserstein distance: Wasserstein distance (Earth mover’s distance) between two data distributions.
Fréchet Inception Distance (FID): extract InceptionNet features and measure the data distribution distance. The lower, the better.
KNN score: treat true data as positive and generated data as negative. Calculate the leave-one-out (LOO) accuracy based on 1-NN classifier.
Learned Perceptual Image Patch Similarity (LPIPS): [3] [code]
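Two of the metrics above are simple enough to sketch directly. The code below is a minimal illustration, not the code released with [1]: it assumes the class probabilities $p_M(y|x)$ and the feature vectors have already been computed by some network, and the function names are hypothetical. The Inception Score is the exponential of the average KL divergence between each conditional label distribution and the marginal, and the KNN score is the leave-one-out accuracy of a 1-NN classifier that separates real from generated samples.

```python
import math


def inception_score(probs):
    """probs: list of per-sample class-probability lists, i.e. p_M(y|x)."""
    n, k = len(probs), len(probs[0])
    # Marginal p_M(y): average of the conditional distributions.
    marginal = [sum(p[c] for p in probs) / n for c in range(k)]
    kl_sum = 0.0
    for p in probs:
        # KL(p(y|x) || p(y)); terms with zero probability contribute 0.
        kl_sum += sum(pc * math.log(pc / marginal[c])
                      for c, pc in enumerate(p) if pc > 0)
    return math.exp(kl_sum / n)


def loo_1nn_accuracy(real, fake):
    """Leave-one-out accuracy of a 1-NN classifier (real = 1, generated = 0)."""
    points = [(x, 1) for x in real] + [(x, 0) for x in fake]
    correct = 0
    for i, (xi, yi) in enumerate(points):
        best_label, best_dist = None, float("inf")
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue  # leave the query point out
            dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
            if dist < best_dist:
                best_dist, best_label = dist, yj
        correct += int(best_label == yi)
    return correct / len(points)
```

With confident predictions spread evenly over $K$ classes the Inception Score approaches $K$, and with identical uniform predictions it equals 1. For the 1-NN score, an accuracy near 1 means generated samples are easy to tell apart from real ones, while an accuracy near 0.5 means the two sets are hard to distinguish.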
catheter, cannulation, EM tracking (enhance visualization and provide objective metric)
Early endovascular technique: real-time fluoroscopy and 2D angiography, which involve ionizing radiation and repeated injection of a nephrotoxic contrast agent.
Image fusion techniques: project 3D CT and magnetic resonance imaging onto real-time 2D fluoroscopic images, but still require real-time fluoroscopy.
Electromagnetic (EM) tracking: an EM field is generated by the Aurora Window Field Generator, and sensors on the tips measure and transmit the roll orientation and forward motion. The accuracy of EM tracking is evaluated on the basis of target registration error (TRE).
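As a rough illustration of TRE, the sketch below maps tracked target points through a rigid registration (rotation `R`, translation `t`) and reports the mean Euclidean distance to the reference positions. Both the rigid-transform model and the use of the mean (rather than, say, the RMS) are assumptions for this example; the function names are hypothetical.

```python
import math


def apply_rigid(R, t, p):
    """Apply the rigid transform x -> R x + t to a 3D point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def target_registration_error(R, t, tracked_pts, reference_pts):
    """Mean distance between registered tracked points and reference points."""
    dists = [math.dist(apply_rigid(R, t, p), q)
             for p, q in zip(tracked_pts, reference_pts)]
    return sum(dists) / len(dists)
```

The target points used here should be different from the fiducials used to compute the registration itself, otherwise the error is underestimated.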

Different vessel locations correspond to tasks with different difficulty levels.

A metric is required to evaluate or augment the skills of surgeons (novice, intermediate, expert). Candidates:
1. expert observation and subjective score
2. number of cases
3. kinematic metrics: mapping from kinematic data to skill (classification)
Borrowing aviation terminology, these rotations will be referred to as yaw, pitch, and roll:
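A yaw is a counterclockwise rotation of \( \alpha\) about the \(z\)-axis. The rotation matrix is given by
\[ R_z(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
A pitch is a counterclockwise rotation of \( \beta\) about the \( y\)-axis. The rotation matrix is given by
\[ R_y(\beta) = \begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix} \]
A roll is a counterclockwise rotation of \( \gamma\) about the \( x\)-axis. The rotation matrix is given by
\[ R_x(\gamma) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\gamma & -\sin\gamma \\ 0 & \sin\gamma & \cos\gamma \end{pmatrix} \]
The three rotations are combined as \( R(\alpha,\beta,\gamma) = R_z(\alpha)\,R_y(\beta)\,R_x(\gamma)\). Note that \( R(\alpha,\beta,\gamma)\) performs the roll first, then the pitch, and finally the yaw. If the order of these operations is changed, a different rotation matrix would result.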
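The composition of yaw, pitch, and roll can be checked numerically. The sketch below uses only the standard library and assumes the usual counterclockwise rotation matrices about the \(z\), \(y\), and \(x\) axes; the helper names (`rot_z`, `rot_y`, `rot_x`, `matmul`, `matvec`, `rotation`) are hypothetical.

```python
import math


def rot_z(a):
    """Yaw: counterclockwise rotation by a about the z-axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(b):
    """Pitch: counterclockwise rotation by b about the y-axis."""
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_x(g):
    """Roll: counterclockwise rotation by g about the x-axis."""
    c, s = math.cos(g), math.sin(g)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


def rotation(alpha, beta, gamma):
    """R = R_z(alpha) R_y(beta) R_x(gamma): roll first, then pitch, then yaw."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_x(gamma)))
```

Comparing `matmul(rot_z(a), rot_y(b))` with `matmul(rot_y(b), rot_z(a))` for nonzero angles shows that the two products differ, which is the order dependence mentioned above.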
Roll does not change the gaze direction, so only yaw and pitch affect it. Given a normalized 3D vector \((x,y,z)\), how do we determine the yaw and pitch angles?
The answer depends on the order in which yaw and pitch are applied.
Consider a rigid eye model (attached to a rigid head model) that is initially aligned with the original coordinate system and faces the positive \(x\) direction. Since roll has no effect on the eye direction, we only perform yaw and pitch. For the coordinate transformation, we consider the reverse process.
The eye direction is \(c_1 = (1,0,0)\) in the new coordinate system but \(c_2 = (x_0,y_0,z_0)\) in the original coordinate system.
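If the true rotation order is yaw->pitch, then \(c_2 = R_z(\alpha)^{-1}R_y(\beta)^{-1}c_1 = (\cos\alpha\cos\beta,\,-\sin\alpha\cos\beta,\,\sin\beta)^T\). Then, \(\beta=\arcsin(z_0),\ \alpha=-\arctan(y_0/x_0)\).
If the true rotation order is pitch->yaw, then \(c_2 = R_y(\beta)^{-1}R_z(\alpha)^{-1}c_1 = (\cos\alpha\cos\beta,\,-\sin\alpha,\,\cos\alpha\sin\beta)^T\). Then, \(\alpha=-\arcsin(y_0),\ \beta=\arctan(z_0/x_0)\).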
If we insert \(R_x(\gamma)\) before \(c_1\), the results won’t change, which demonstrates that roll will not influence eye direction. In other words, if the true rotation order is yaw->pitch->roll or pitch->yaw->roll, the above analysis still holds.
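The two cases can be sketched as a pair of functions that convert between angles and a unit gaze vector. This is an illustration under the conventions used above (eye facing the positive \(x\) direction, angles recovered from the vector expressed in the original coordinate system); the function names and the `order` argument are hypothetical. `math.atan2` is used in place of a plain arctangent, which gives the same value when \(x_0 > 0\) and avoids division by zero.

```python
import math


def gaze_from_angles(alpha, beta, order="yaw_pitch"):
    """Unit gaze vector (x0, y0, z0) in the original frame for yaw alpha, pitch beta."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    if order == "yaw_pitch":
        return (ca * cb, -sa * cb, sb)
    if order == "pitch_yaw":
        return (ca * cb, -sa, ca * sb)
    raise ValueError("order must be 'yaw_pitch' or 'pitch_yaw'")


def angles_from_gaze(x0, y0, z0, order="yaw_pitch"):
    """Recover (alpha, beta) from a normalized gaze vector."""
    clamp = lambda v: max(-1.0, min(1.0, v))  # guard asin against rounding error
    if order == "yaw_pitch":
        beta = math.asin(clamp(z0))
        alpha = -math.atan2(y0, x0)
    elif order == "pitch_yaw":
        alpha = -math.asin(clamp(y0))
        beta = math.atan2(z0, x0)
    else:
        raise ValueError("order must be 'yaw_pitch' or 'pitch_yaw'")
    return alpha, beta
```

A round trip such as `angles_from_gaze(*gaze_from_angles(0.3, -0.4))` returns the original angles as long as both stay within \((-\pi/2, \pi/2)\); using the wrong `order` for a given vector generally yields different angles.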
Notice:
-