QoI : Assessing Participation in Threat Information Sharing


Presentation Transcript


  1. QoI: Assessing Participation in Threat Information Sharing Jeman Park

  2. Outline • Threat Information (TI) Sharing • Quality of Indicator • System Architecture • Methodology (Numerical Scoring) • Dataset • Results • Conclusion

  3. Threat Information (TI) Sharing <Example of structured TI sharing [1]> • Threat information is shared with trusted partners using information sharing standards. [1] https://stixproject.github.io

  4. Threat Information Sharing, cont. • A better countermeasure can be found only if users actively share meaningful threat information. • There are two ways to evaluate a user’s contribution: • Quantity: How much information does the user contribute? • Quality: How useful is the information the user contributes? • However, current contribution measurement focuses mainly on the quantitative aspect.

  5. Quality of Indicator (QoI) • We identified four metrics to be used for the qualitative evaluation of a user’s contribution: • Correctness: captures whether the attributes of an indicator (e.g., the label used for attribution) are consistent with the assessor’s reference. • Relevance: measures the extent to which an indicator is contextual and of interest to the rest of the community. • Utility: captures whether an indicator characterizes prominent features of cyber-threats. • Uniqueness: measures whether an indicator has been seen before, based on its similarity with previously seen indicators. • By aggregating the scores of these four metrics, the aggregated QoI score can be generated.
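A minimal sketch of how the four metric scores might be combined into one QoI value, assuming each metric has already been scored in [0, 1]; the IndicatorQuality container, the aggregate_qoi helper, and the weight values are illustrative assumptions, not the paper's implementation.

```python
# Sketch of per-indicator QoI aggregation (weights are assumptions).
from dataclasses import dataclass

@dataclass
class IndicatorQuality:
    correctness: float
    relevance: float
    utility: float
    uniqueness: float

def aggregate_qoi(q: IndicatorQuality,
                  weights=(0.4, 0.2, 0.2, 0.2)) -> float:
    """Weighted sum of the four metric scores."""
    scores = (q.correctness, q.relevance, q.utility, q.uniqueness)
    return sum(w * s for w, s in zip(weights, scores))

# Example: a correct, relevant, fairly useful but non-unique indicator.
print(aggregate_qoi(IndicatorQuality(1.0, 0.8, 0.6, 0.2)))
```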

  6. System Architecture

  7. System Architecture, cont. • Defining Metrics: • Quality metrics are defined as the measurement criteria used to score the threat indicators that users provide. • Defining Labels: • Annotations can be labels capturing the type of threat, the level (of severity, timeliness, etc.), or the quality type of an indicator. • Utilizing these annotations, a scoring method converts the quality labels into a numeric score for the indicator.
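A small sketch of the label-to-score conversion step, assuming a simple lookup table; the label vocabulary (high/medium/low), the score values, and the label_to_score helper are hypothetical illustrations of the idea rather than the system's actual mapping.

```python
# Convert annotation labels to numeric scores (labels and values assumed).
LABEL_SCORES = {
    "high": 1.0,    # e.g., severe / timely / well-attributed indicator
    "medium": 0.5,
    "low": 0.1,
}

def label_to_score(label: str) -> float:
    """Return a numeric score for an annotation label (0.0 if unknown)."""
    return LABEL_SCORES.get(label.lower(), 0.0)

print(label_to_score("High"))    # 1.0
print(label_to_score("unseen"))  # 0.0
```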

  8. System Architecture, cont. • Building a Reference: • The reference dataset is used to evaluate QoI for a sample of indicators submitted by a sample provider. • To build the initial reference dataset, data collected through security operations is vetted for validity and applicability. • Extrapolating: • Extrapolation allows each assessor to predict the label of an indicator using its feature set and a classifier model. • The classifier is trained in a supervised manner on labeled samples extracted from the reference dataset. • We use the Nearest Centroid Classifier (NCC), a classification model that assigns to each observation the label of the class of training samples whose mean (centroid) is closest to the observation.
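A runnable sketch of the extrapolation step using scikit-learn's NearestCentroid, assuming indicators in the reference dataset have already been encoded as numeric feature vectors; the toy features and labels below are made up for illustration.

```python
# Train a nearest-centroid classifier on a vetted reference set and
# predict the label of a new indicator from its feature vector.
import numpy as np
from sklearn.neighbors import NearestCentroid

# Reference dataset: feature vectors and their vetted labels (toy values).
X_ref = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
y_ref = np.array(["ddos", "ddos", "apt", "apt"])

clf = NearestCentroid()
clf.fit(X_ref, y_ref)  # centroids = per-class means of X_ref

print(clf.predict(np.array([[0.85, 0.15]])))  # -> ['ddos']
```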

  9. Numerical Scoring • To illustrate QoI, we conducted QoI scoring on Anti-Virus (AV) vendors and their malware family labels. • Correctness: • The reference dataset is used as the benchmark for determining the correct label for an arbitrary sample. • After building and training the classifier, the label assigned by the vendor is compared with the label predicted by the classifier, and a positive score is given if the labels match. • Relevance: • The weight values are chosen based on the interest of community members, and a mapping function is defined to assign weights, for example, giving a higher weight to trojan and a lower weight to DDoS.
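A hedged sketch of the correctness and relevance scoring described above; correctness_score, relevance_score, and the family weights are assumptions that only echo the slide's trojan-vs-DDoS example, not the paper's exact values.

```python
# Correctness: positive score when the vendor label matches the
# reference classifier's prediction.
def correctness_score(vendor_label: str, predicted_label: str) -> float:
    return 1.0 if vendor_label.lower() == predicted_label.lower() else 0.0

# Relevance: community-interest weights per family (values assumed).
RELEVANCE_WEIGHTS = {"trojan": 1.0, "rootkit": 0.8, "apt": 0.7, "ddos": 0.4}

def relevance_score(family: str) -> float:
    return RELEVANCE_WEIGHTS.get(family.lower(), 0.5)  # default for unmapped families

print(correctness_score("Zeus", "zeus"))  # 1.0
print(relevance_score("DDoS"))            # 0.4
```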

  10. Numerical Scoring, cont. • Utility: • The utility score is assigned differently depending on the type of label submitted by the vendor. • For instance, we use three classes: • complete labels: industrially popular names. • generic labels: commonly used names such as ‘generic’, ‘worm’, and ‘trojan’. • incomplete labels: ‘suspicious’, ‘malware’, and ‘unclassified’. • Uniqueness: • Malware samples that have not been seen before (e.g., no matching hash values in the current dataset) are given a high uniqueness score. • Aggregated QoI: • The weights are set according to the importance of each metric, and all weighted scores are summed to calculate the aggregated QoI score.
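A short sketch of utility and uniqueness scoring under the three label classes above; the numeric score values and the hash-based novelty check are assumptions for illustration.

```python
# Utility: score by label class (complete / generic / incomplete).
GENERIC = {"generic", "worm", "trojan"}
INCOMPLETE = {"suspicious", "malware", "unclassified"}

def utility_score(label: str) -> float:
    l = label.lower()
    if l in INCOMPLETE:
        return 0.2   # incomplete label
    if l in GENERIC:
        return 0.5   # generic label
    return 1.0       # complete label, e.g. a specific family name

# Uniqueness: high score for samples whose hash has not been seen before.
def uniqueness_score(sample_hash: str, seen_hashes: set) -> float:
    return 1.0 if sample_hash not in seen_hashes else 0.1

seen = {"a1b2", "c3d4"}
print(utility_score("Zeus"), uniqueness_score("ffff", seen))  # 1.0 1.0
```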

  11. Dataset • For the evaluation of QoI, we used a dataset of 11 malware families submitted to VirusTotal and labeled by 48 AV vendors, listed as family (# of samples, type): • Avzhan (3,458, DDoS) • Darkness (1,878, DDoS) • Ddoser (502, DDoS) • Jkddos (333, DDoS) • N0ise (431, DDoS) • ShadyRAT (1,287, APT) • DNSCalc (403, APT) • Lurid (399, APT) • Getkys (953, APT) • ZeroAccess (568, rootkit) • Zeus (1,975, banking trojan)

  12. Results • Correctness: • Some vendors (4, 27, and 30) outperformed the others, with scores ranging from the 80s to the high 90s. • The majority of vendors have significantly lower correctness-based contribution measures than their volume-based scores. • Relevance: • Certain contributors (42 and 43) with high volume-based scores have low relevance scores, while others (10, 16, and 27) show the opposite pattern.

  13. Results, cont. • Utility: • Certain vendors (39 through 41) are rated as high-utility indicator providers, surpassing their volume-based scores. • Aggregated QoI: • Some vendors (39 and 46) with low QoI scores are rated with higher volume-based scores, potentially alluding to free-riding. • Some other vendors (1, 8, and 33) contribute a small volume of indicators but get high QoI scores, which means they share appropriate and useful information.

  14. Conclusion In this paper, we take a first look at the notion of quality of indicators (QoI). As empirically analyzed, levels of contribution cannot be adequately expressed by a volume-based measure alone. By verifying our metrics on real-world antivirus scan data, we show that contribution measured by volume is not always consistent with contribution measured by quality.

  15. Thank You
