DARTS
2019-ICLR-DARTS Differentiable Architecture Search
- Hanxiao Liu, Karen Simonyan, Yiming Yang
- GitHub: 2.8k stars
- Citations: 557
Motivation
Current NAS methods:
- Computationally expensive: on the order of 2000 to 3000 GPU days.
- Search over a discrete space, which requires a large number of architecture evaluations.
Contribution
- A differentiable NAS method based on gradient descent.
- Applicable to both CNNs (CV) and RNNs (NLP).
- SOTA results on CIFAR-10 and PTB.
- Efficiency: 4 GPU days versus 2000 GPU days.
- Transferable: CIFAR-10 to ImageNet, and PTB to WikiText-2.
Method
Search Space
Optimization Target
Our goal is to jointly learn the architecture α and the weights w within all the mixed operations (e.g. weights of the convolution filters).
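As a rough illustration of what a mixed operation is, the sketch below computes a softmax-weighted sum of candidate operations on a single edge, with the weights given by that edge's architecture parameters α. The toy scalar operations, the names `CANDIDATE_OPS` and `mixed_op`, and the α values are assumptions made for illustration only; a real search space would use convolutions, pooling, identity, and a zero operation acting on feature maps, and each operation would carry its own weights w.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy candidate operations on a scalar input. They stand in
# for the real candidates (convolutions, pooling, identity, zero).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_op(x, alpha):
    """Softmax(alpha)-weighted sum of every candidate operation on one edge."""
    names = list(CANDIDATE_OPS)
    weights = softmax([alpha[name] for name in names])
    return sum(w * CANDIDATE_OPS[name](x) for w, name in zip(weights, names))

# With equal alpha values every operation gets weight 1/3,
# so the output is (3 + 6 + 0) / 3.
alpha = {"identity": 0.0, "double": 0.0, "zero": 0.0}
out = mixed_op(3.0, alpha)
```

Because the output is a smooth function of α, a validation loss computed on it can be differentiated with respect to α, which is what lets the architecture be optimized by gradient descent alongside the weights w.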
Discrete Arch
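In the paper, the discrete architecture is obtained after the search by keeping, for each intermediate node, the strongest incoming edges and replacing each kept mixed operation with its most likely non-zero operation. A minimal sketch of that selection for a single node follows; the function name `derive_discrete`, the node names, and the α values are hypothetical, and k=2 matches the setting the paper uses for convolutional cells.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def derive_discrete(alpha_per_edge, k=2):
    """Select (predecessor, operation) pairs for one intermediate node.

    alpha_per_edge maps a predecessor name to {op_name: alpha_value}.
    The strength of an edge is the largest softmax weight among its
    non-zero operations; the k strongest edges are kept.
    """
    scored = []
    for pred, alpha in alpha_per_edge.items():
        names = list(alpha)
        weights = softmax([alpha[n] for n in names])
        # The "zero" op still takes part in the softmax but is never selected.
        best_w, best_op = max(
            (w, n) for w, n in zip(weights, names) if n != "zero"
        )
        scored.append((best_w, pred, best_op))
    scored.sort(reverse=True)
    return [(pred, op) for _, pred, op in scored[:k]]

# Illustrative alpha values for three incoming edges (made-up numbers).
alpha_per_edge = {
    "prev_prev": {"sep_conv_3x3": 1.2, "skip_connect": 0.3, "zero": 2.0},
    "prev": {"sep_conv_3x3": 0.1, "max_pool_3x3": 0.9, "zero": 0.0},
    "node0": {"sep_conv_3x3": 0.2, "skip_connect": 0.25, "zero": 0.1},
}
chosen = derive_discrete(alpha_per_edge)
# chosen == [("prev", "max_pool_3x3"), ("node0", "skip_connect")]
```

On the `node0` edge the winning operation has a softmax weight only slightly above its alternatives, yet the discrete architecture keeps it alone and drops the rest. This makes the selection step a lossy projection of the continuous encoding.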
Experiments
Arch Evaluation
Result Analysis
Conclusion
- We presented DARTS, a simple yet efficient NAS algorithm for both CNNs and RNNs.
- SOTA results on CIFAR-10 and PTB.
- Search efficiency improved by several orders of magnitude.
Possible Improvements
- Discrepancies between the continuous architecture encoding and the derived discrete architecture (softmax…).
- It would also be interesting to investigate performance-aware architecture derivation schemes based on the shared parameters learned during the search process.