This post generated many comments, including an interesting observation from Yoshua Bengio, one of the leading experts on Machine Learning and Deep Learning.
He wrote "I agree with (Zack Lipton's) analysis, and I am glad that you have put this discussion online."
Yoshua continued:
My conjecture is that *good* unsupervised learning should generally be much more robust to adversarial distortions because it tries to discriminate the data manifold from its surroundings, in ALL non-manifold directions (at every point on the manifold). This is in contrast with supervised learning, which only needs to worry about the directions that discriminate between the observed classes. Because the number of classes is much smaller than the dimensionality of the space for image data, supervised learning is therefore highly underconstrained, leaving many directions of change "unchecked" (i.e. to which the model is either insensitive when it should not be, or too sensitive in the wrong way).
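The underconstrained-directions argument can be made concrete with a toy example. The sketch below (my own illustration, not from Bengio's comment) uses a linear binary classifier in a 100-dimensional space: its decision depends on the input only through a single direction (the weight vector `w`), so any perturbation orthogonal to `w`, however large, is invisible to the model, while a tiny step along `w` flips the prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100                      # input dimensionality (e.g. pixel count)
w = rng.normal(size=d)       # weight vector of a linear binary classifier
w /= np.linalg.norm(w)

x = rng.normal(size=d)       # an arbitrary input point
logit = w @ x

# Build a perturbation orthogonal to w: the classifier is completely
# blind to it, no matter how large we make it ("unchecked" direction).
v = rng.normal(size=d)
v -= (w @ v) * w             # project out the component along w
v /= np.linalg.norm(v)
assert abs(w @ (x + 10.0 * v) - logit) < 1e-9

# In contrast, a tiny step along w flips the decision -- the
# adversarial direction the supervised objective actually constrains.
eps = abs(logit) + 1e-3
x_adv = x - np.sign(logit) * eps * w
assert np.sign(w @ x_adv) != np.sign(logit)
```

With one classifier and d = 100, only 1 of the 100 directions is constrained by the decision boundary; the other 99 form the "unchecked" subspace Bengio describes, and the same counting argument scales to deep classifiers with far fewer classes than input dimensions.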