The increasing practicality of large-scale flow capture makes it possible
to conceive of traffic analysis methods that detect and identify a large
and diverse set of anomalies. However, the challenge of effectively
analyzing this massive data source for anomaly diagnosis is as yet
unmet. We argue that the distributions of packet features (IP addresses
and ports) observed in flow traces reveal both the presence and the
structure of a wide range of anomalies. Using entropy as a summarization
tool, we show that the analysis of feature distributions leads to
significant advances on two fronts: (1) it enables highly sensitive
detection of a wide range of anomalies, augmenting detections by
volume-based methods, and (2) it enables automatic classification of
anomalies via unsupervised learning. We show that, when characterized by
their feature distributions, anomalies naturally fall into distinct and meaningful
clusters. These clusters can be used to automatically classify anomalies
and to uncover new anomaly types. We validate our claims on data from two
backbone networks (Abilene and Geant) and conclude that feature
distributions show promise as a key element of a fairly general network
anomaly diagnosis framework.
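
As an illustration of how entropy can summarize a feature distribution, the
following minimal sketch computes the Shannon entropy of the empirical
distribution of one packet feature within a time bin. It assumes the
standard sample-entropy definition; the abstract does not specify the exact
estimator, and the function name `sample_entropy` and the example addresses
are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def sample_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`.

    Assumption: this is the textbook estimator, not necessarily the exact
    form used by the authors.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty bin carries no information
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Illustrative (made-up) destination addresses observed in two time bins.
# Dispersed: every flow goes to a different address.
dispersed_bin = [f"192.0.2.{i}" for i in range(1, 9)]
# Concentrated: most flows go to a single address.
concentrated_bin = ["192.0.2.1"] * 7 + ["192.0.2.2"]

h_dispersed = sample_entropy(dispersed_bin)
h_concentrated = sample_entropy(concentrated_bin)
```

In this sketch the concentrated bin yields a lower entropy than the
dispersed one, so a change in how traffic is spread across addresses or
ports shows up in the entropy value even when the number of flows is
unchanged.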