cnn就是self-attention的特例。self-attention更灵活,但是如果训练集小可能更容易过拟。 4. self-attention vs RNN 5. self-attention for Graph:用attention来表示nodes之间的关联
京公网安备 11010502049817号