Turning Customer Data into Strategic Advantage with Splunk MLTK
24 Sep 2025
Author: Mahmut Arda Cömertoğlu – Information Security Engineer at Sekom
In today’s digital era, data is often referred to as the “new oil.” Organizations shape their sales, investment, and security strategies based on the data they possess. As a result, enriching data with external sources, adding value through methods such as machine learning, and communicating findings to senior management in a clear and actionable way have become critical.
Beyond generating value from data, analyzing potential attack patterns and preventing data breaches before they occur is equally important. We increasingly see organizations suffer losses worth billions of dollars due to data breaches.
With its robust technological capabilities, Splunk has long been a market leader in securing organizations and turning raw data into meaningful insights. In addition to serving as a watchtower against internal and external threats, Splunk leverages machine learning and advanced analytics to unlock the true potential of organizational data.
How Did We Add Value to Customer Data Using Splunk MLTK?
Our customer operated an ecosystem where data from multiple vendors was centralized on a single platform. The key requirement was to understand traffic patterns for each vendor and detect potential anomalies using machine learning.
To achieve this, we first examined the existing data and its attributes. Next, we selected the appropriate machine learning method from the Splunk Machine Learning Toolkit (MLTK). After careful consideration, we decided to use Smart Outlier Detection, a technique designed to identify anomalies in numerical data that relies primarily on density-based algorithms behind the scenes.
This approach learns the typical (baseline) distribution of the selected metrics and flags data points that fall outside the expected boundaries as anomalies.
As part of the Define Data Source step, we built a custom SPL query to separate traffic by vendor. Then, in the Learn Data step, we set “Field to analyze” to traffic and “Split by fields” to vendor. This made it possible to analyze traffic logs per vendor for anomaly detection. The distribution type was left at its default, while the Outlier tolerance threshold was set to 0.019.
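The idea behind this configuration can be illustrated outside Splunk with a minimal Python sketch. It is not MLTK's implementation: it assumes a normal distribution is fitted per vendor and that the tolerance of 0.019 is split evenly across both tails. The function names and the sample values are hypothetical and serve only as illustration.

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def fit_bounds(rows, threshold=0.019):
    """Fit one normal distribution per vendor and return its outlier bounds.

    rows: iterable of (vendor, traffic) pairs used as the baseline.
    threshold: share of probability mass treated as outlying; this sketch
               assumes it is split evenly between the lower and upper tail.
    Each vendor needs at least two values with non-zero spread.
    """
    by_vendor = defaultdict(list)
    for vendor, traffic in rows:
        by_vendor[vendor].append(float(traffic))

    bounds = {}
    for vendor, values in by_vendor.items():
        dist = NormalDist(mean(values), stdev(values))
        lower = dist.inv_cdf(threshold / 2)
        upper = dist.inv_cdf(1 - threshold / 2)
        bounds[vendor] = (lower, upper)
    return bounds


def flag_outliers(rows, bounds):
    """Return the (vendor, traffic) pairs that fall outside their vendor's bounds.

    Every vendor in rows must already have an entry in bounds.
    """
    return [(v, t) for v, t in rows if not bounds[v][0] <= t <= bounds[v][1]]


# Hypothetical baseline traffic for two vendors.
baseline = [("A", x) for x in (100, 102, 98, 101, 99, 100, 103, 97)]
baseline += [("B", x) for x in (10, 12, 11, 9, 10, 11, 10, 12)]

bounds = fit_bounds(baseline, threshold=0.019)

# New observations are checked against the bounds learned per vendor.
observed = [("A", 100), ("A", 150), ("B", 11), ("B", 30)]
flagged = flag_outliers(observed, bounds)
print(flagged)
```

Splitting by vendor matters because each vendor gets its own bounds: a traffic value that is normal for one vendor can be far outside the expected range for another.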
The results revealed where anomalies were concentrated and which vendors maintained stable traffic patterns. Two key metrics were instrumental in this analysis: std/mean (volatility) and Wasserstein distance (the difference between the learned and observed distributions).
- Vendors B and E displayed a high mean, high volatility, and high distance values, signaling elevated risk.
- Vendor D, with low std/mean and low distance values, exhibited much greater stability.
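For readers unfamiliar with the two metrics, the following Python sketch shows one way to compute them for one-dimensional samples. It is an illustration under stated assumptions, not the calculation MLTK performs internally: std/mean is taken as the sample standard deviation divided by the mean, and the Wasserstein distance is computed between two empirical samples (a baseline and an observed window). The function names and data are hypothetical.

```python
from bisect import bisect_right
from statistics import mean, stdev


def volatility(values):
    """std/mean (coefficient of variation) of a sample with a non-zero mean."""
    return stdev(values) / mean(values)


def wasserstein_1d(a, b):
    """First-order Wasserstein distance between two 1-D empirical samples.

    Computed as the area between the two empirical CDFs, which also works
    when the samples differ in size.
    """
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


# Hypothetical samples: a learned baseline and a later observation window.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
shifted = [x + 20 for x in baseline]

print(volatility(baseline))
print(wasserstein_1d(baseline, baseline))
print(wasserstein_1d(baseline, shifted))
```

A low std/mean means traffic stays close to its average, and a distance near zero means the observed values follow the learned distribution closely. In this sketch, shifting every value by 20 produces a distance of 20, while comparing a sample with itself gives 0.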
Next, we deployed the model generated in Splunk MLTK within Splunk Enterprise Security (ES) as a defined model. We then created a scheduled SPL query to retrain the model every 24 hours, ensuring it remained up to date.
Finally, we implemented event-based detection in ES, enabling real-time anomaly alerts and the ability to monitor boundary thresholds effectively. Once activated, the anomalies could be tracked seamlessly via the Analyst Queue dashboard.
Conclusion
Splunk’s leadership in the industry comes not only from its powerful security capabilities but also from its ability to transform raw data into valuable insights through methods such as machine learning. With Splunk and our team’s expertise, organizations can shape strategies across multiple domains—from security to operations—using actionable, data-driven intelligence.
Ready to turn your raw data into strategic insights with Splunk MLTK?
Contact us today and start unlocking the true potential of your data refinery.