Asset management is a foundational IT service that many organizations continue to struggle to provide. Worse yet, from the security perspective, this affects all of the secondary and tertiary services that rely on this foundation, such as vulnerability management, security monitoring, analysis, and response, to name a few. While it is very rare that an organization has an up-to-date or accurate picture of all of its IT assets, it is even rarer (think rainbow-colored unicorn that pukes Skittles) that an organization has an accurate picture of the criticality of its assets. While some do a decent job, when standing up a CMDB, of mapping applications to supporting infrastructure and ranking their criticality (although many tend to use a binary critical/not critical ranking), these criticality rankings are statically assigned and, if not updated over time, may turn stale. Manual re-evaluation of assets and applications in a CMDB is a time-consuming task, and many organizations, after the pain of setting up the CMDB in the first iteration, are not willing or likely to make re-certification of assets and criticality rankings a priority…and it is easy to understand why. My other issue is that many CMDBs take into account the “availability” factor of an asset over the criticality of the asset from a security perspective. For example, it is not uncommon to see a rigorous change management process for assets in the CMDB with a slightly less rigorous (or non-existent) change management process for non-CMDB assets. But I digress…to summarize my problem:
- Asset criticality often does not exist or is assigned upon the asset’s entry into a central tracking mechanism or CMDB
- The effort to manually determine and recertify asset criticality is often so great that manual processes fail or produce inaccurate data
- In order for asset criticality data to be useful we may need near real time views of the criticality that change in concert with the asset’s usage
- Without accurate asset inventories and criticalities we cannot accurately represent overall risk or risk posture of an organization
The impact of inaccurate asset inventories and lack of up-to-date criticality rankings got me thinking that there has to be a better way. Since I spend a majority of my time in the security monitoring space, and now what seems to be the threat intel and security/data analytics space, I kept thinking of possible solutions. The one factor I found to be in common with every possible solution was data. And why not? We used to talk about the problems of “too much data” and how we were drowning in it…so why not use it to infer the criticality of assets and to update their criticality in an automated fashion? Basically, make the data work for us for a change.
To start I looked for existing solutions but couldn’t find one. Yes, some vendors have pieces of what I was looking for (e.g. identity analytics), but no one vendor had a solution that fit my needs. In general, my thought process was:
- We may be starting with a statically defined criticality rating for certain assets and applications (i.e. CMDB), and I’m fine with that as a starting point
- I need a way to gather and process data that would support, or reject, the statically assigned ratings
- I also need a way to assign ratings to assets outside of what has been statically assigned (i.e. critical assets not included in CMDB)
- The rating system shouldn’t be binary (yes/no) but more flexible and take into account real-world factors such as the type/sensitivity of the data stored or processed, usage, and network accessibility factors
- Asset criticalities could be inferred and updated on a periodic (e.g. monthly) or real-time basis through data collection and processing
- The side-benefit of all of this would also include a more accurate asset inventory and picture that could be used to support everything from IT BAU processes (e.g. license management) to security initiatives (e.g. VM, security monitoring, response, etc.)
These 6 thoughts guided the drafting of a research paper, posted here (http://www.malos-ojos.com/wp-content/uploads/2014/08/DGRZETICH-Ideas-on-Asset-Criticality-Inference.pdf), that I’ve been ever so slowly working on. Keep in mind that the paper is a draft and still a work in progress. It attempts to start solving the problem using data and the idea that we should be able to infer the criticality of an asset based on models and data analytics. I’ve been thinking about this for a while now (the paper was dated 6/26/2013), and last year I even attempted to gather a sample data set and to work with the M.S. students from DePaul in the Predictive Analytics concentration to solve it, but that never came to fruition. Maybe this year…
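To make the thought process above a little more concrete, here is a minimal sketch of what inferring criticality from data might look like. It is not the model from the paper: the factor names, the weights, the 0 to 1 scale, and the way the static CMDB rating is blended in are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights (assumption); a real model would be tuned against real data.
WEIGHTS = {"data_sensitivity": 0.5, "usage": 0.3, "network_exposure": 0.2}


@dataclass
class AssetObservation:
    """Factors observed for one asset, each normalized to the range 0.0-1.0."""
    name: str
    data_sensitivity: float   # type/sensitivity of data stored or processed
    usage: float              # how heavily the asset is used
    network_exposure: float   # how reachable the asset is on the network
    static_rating: Optional[float] = None  # CMDB rating, if one was ever assigned


def inferred_criticality(obs: AssetObservation) -> float:
    """Weighted sum of the observed factors (0.0 = not critical, 1.0 = critical)."""
    return sum(weight * getattr(obs, factor) for factor, weight in WEIGHTS.items())


def updated_criticality(obs: AssetObservation, trust_in_static: float = 0.4) -> float:
    """Blend the static rating with the inferred score.

    Assets that never made it into the CMDB get the inferred score alone.
    """
    inferred = inferred_criticality(obs)
    if obs.static_rating is None:
        return inferred
    return trust_in_static * obs.static_rating + (1 - trust_in_static) * inferred


if __name__ == "__main__":
    # Made-up assets: one with a CMDB rating, one that was never registered.
    hr_db = AssetObservation("hr-db-01", 0.9, 0.6, 0.3, static_rating=0.5)
    shadow = AssetObservation("unknown-file-share", 0.8, 0.7, 0.5)
    for asset in (hr_db, shadow):
        print(asset.name, round(updated_criticality(asset), 2))
```

Re-running something like this on a schedule, with the factors refreshed from collected data, is what would keep the rating from going stale. The score is continuous rather than a binary critical/not critical flag.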
While this isn’t a post on what threat intelligence is or is not, I’d be negligent if I didn’t at least begin to put some scope and context around this term, as the focus of this post is on making threat data and intelligence actionable. Not to mention, every vendor and their grandmother is trying to use this phrase to sell products and services without fully understanding or defining its meaning.
First, it is important to understand that there is a difference between data and threat intelligence. There are providers of data, which is generally some type of atomic indicator (i.e. IOC) that comes in the form of an IP address, URL, domain, metadata, email address, or hash. This data, in its least useful form, is a simple listing of indicators without attribution to the threat actor(s) or campaigns with which they are associated. Some providers include the malware/malware family/RAT that was last associated with the indicator (e.g. Zeus, CryptoLocker, PlugX, njRAT, etc.) and the date of the last activity. Other providers focus on telemetry data about the indicator (e.g. who registered the domain, geolocated IP, AS numbers, and so on). Moving up the maturity scale and closer to real intel are providers that track a series of indicators such as IPs, domains/subdomains, email addresses, and metadata to a campaign and threat actor or group. If we add to the atomic indicators the tactics used by the threat actors (e.g. phishing campaigns that include a weaponized PDF that installs a backdoor that connects to C2 infrastructure associated with a threat actor or group), we start to build a more holistic view of the threat. Once we understand the tactics, techniques and procedures (TTPs) and the capability of our adversaries, we focus on the intent of the actors/groups or personas and how their operations are, or are not, potentially being directed at our organization. The final piece of the equation, which is partially the focus of this post, is understanding how we take these data feeds, enrich them, and then use them in the context of our own organization to move towards providing actual threat intelligence, but that is a post on its own.
Many organizations think that building a threat intelligence capability is a large undertaking. To some extent they are correct in the long term/strategic view for a mature threat intel program that may be years down the road. However, the purpose of this post is to argue that even with just a few data and intel sources we can enable or enhance our current capabilities such as security monitoring/analysis/response and vulnerability management services. I honestly chose these services as they fit nicely in my reference model for a threat monitoring and response program as well as threat intel which is at the center of this reference model. So let’s walk through a few examples…
Enrichment of Vulnerability Data
Vulnerability assessment programs have been around for what seems like forever, but mature vulnerability management programs are few and far between. Why is this? It seems we, as security professionals, are good at buying an assessment technology and running it and that’s about it. What we aren’t very good at is setting up a full cycle vulnerability management program to assign and track vulnerability status throughout the lifecycle. Some of the reasons are due to historical challenges (outlined in more detail in a research paper I posted here: http://goo.gl/yzXB4r) such as poor asset management/ownership information, history of breaking the infrastructure with your scans (real or imagined by IT), or way too many vulnerabilities identified to remediate. Let’s examine that last challenge of having too many vulnerabilities and see if our data and intel feeds can help.
Historically what have security groups done when they were faced with a large number of vulnerabilities? The worst action I’ve seen is to take the raw number of vulnerabilities and present them as a rolling line graph/bar chart over time. This type of reporting does nothing to expose the true risk, which should be one of the main outputs of the vulnerability management program, and infuriates IT by making them look bad. Not to mention these “raw” numbers generally tend to include the low severity vulnerabilities. Do I really care that someone can tell what timezone my laptop is set to? I don’t know about you but I doubt that is going to lead to the next Target breach. Outside of raw numbers, the next type of action usually taken is to assign some remediation order or preference to the assessment results. While a good start, most security teams go into “let’s just look at sev 4 and sev 5 vulnerabilities” mode which may result in what amounts to a still very large list. Enter our threat data…
What if we were able to subscribe to a data feed where the provider tracked current and historical usage of exploits, matched each exploit with the associated vulnerabilities, and hence the required remediation action (e.g. apply patch, change configuration, etc.)? This data, when put into the context of our current set of vulnerabilities, becomes intelligence and allows us to prioritize remediation of the vulnerabilities that pose the greatest risk due to their active use in attack kits as well as non 0-day exploits being used by nation state actors. As a side note, among a few vendors there is a myth being spread that almost all nation-state attacks utilize 0-days, which I find to be an odd statement given that we are so bad at securing our infrastructure through patch and configuration management that it is likely that an Adobe exploit from 2012 is going to be effective in most environments. But I digress.
So how much does using threat data to prioritize remediation really help the program in reducing risk? In my research paper (here: http://goo.gl/yzXB4r) I noted that limiting to sev 4 and sev 5 as well as using threat data it is possible to reduce the number of systems that require remediation by ~60% and the discrete number of patches that needed to be applied was reduced by ~80%. While one may argue that this may still result in a high number of patches and/or systems requiring treatment I would counter-argue that I’d rather address 39,000 systems versus 100,000 and apply 180 discrete patches over 1000 any day. At least I’m making more manageable chunks of work and the work that I am assigning results in a more meaningful reduction of risk.
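As a rough illustration of the filtering described above (not the method from the research paper), the sketch below keeps only sev 4 and sev 5 findings whose CVE appears in a set of actively exploited CVEs, then works through the reduction figures quoted above. The `Finding` structure, the sample data, and the idea that the feed boils down to a simple set of CVE IDs are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One scanner result: a vulnerability found on a host (hypothetical shape)."""
    host: str
    cve: str
    severity: int  # 1 (low) to 5 (critical), as reported by the scanner
    patch: str     # identifier of the patch or fix that remediates it


def prioritize(findings, exploited_cves, min_severity=4):
    """Keep findings that are high severity AND tied to an actively used exploit."""
    return [
        f for f in findings
        if f.severity >= min_severity and f.cve in exploited_cves
    ]


def reduction(before, after):
    """Fractional reduction in workload when going from `before` to `after`."""
    return 1 - after / before


if __name__ == "__main__":
    # Made-up sample data; the CVE and patch identifiers are placeholders.
    findings = [
        Finding("web01", "CVE-0000-0001", 5, "patch-A"),
        Finding("web02", "CVE-0000-0001", 5, "patch-A"),
        Finding("db01", "CVE-0000-0002", 4, "patch-B"),
        Finding("lap01", "CVE-0000-0003", 1, "patch-C"),
    ]
    exploited = {"CVE-0000-0001"}  # what the hypothetical feed reports as in active use

    todo = prioritize(findings, exploited)
    print("hosts to fix:", sorted({f.host for f in todo}))
    print("patches to apply:", sorted({f.patch for f in todo}))

    # The figures quoted in the text, worked through.
    print(f"systems: {reduction(100_000, 39_000):.0%} fewer")
    print(f"patches: {reduction(1000, 180):.0%} fewer")
```

Counting distinct hosts and distinct patches separately matters here, since one patch often closes the same vulnerability on many systems.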
Integrating Your Very First Threat Feed – How Special
In addition to creating a reference model for a security monitoring, analysis and response program (which includes threat intel), I also built out a model for implementing the threat intel service which includes a 4-step flow of: 1. Threat Risk Analysis, 2. Acquisition, 3. Centralization, and 4. Utilization. I’ll detail this model in a future post, along with the fact that in a mature service there would be a level of automation, but for now I’d like to point out that it is perfectly acceptable to build a threat intel program as a series of iterative steps. By simply performing a threat risk assessment and understanding or defining its data and intel needs, an organization should then be able to choose a data or intel provider that is suitable to its goals. Ironically I’ve witnessed a few organizations that went out and procured a feed, or multiple feeds, without understanding how it was going to benefit them or how it would be operationalized…I’ll save those stories for another day. And while I’m not going to cover the differences between finished intel versus indicators/data in this post, it is possible for an organization to procure feeds (open source and commercial feeds) and instrument their network to prevent activity, or at a minimum, detect the presence of the activity.
As an example, let’s say that we have a set of preventive controls in our environment: firewalls, web/email proxies, network-based intrusion prevention systems, and end point controls such as AV, app whitelisting, and host-based firewalls. Let’s also say we have a set of detective controls that includes a log management system and/or security information and event management (SIEM) which is being fed by various network infrastructure components, systems and applications, and our preventive controls mentioned above. For the sake of continuing the example let’s also say that I’m in an industry vertical that performs R&D and would likely be targeted by nation state actors (e.g. this Panda or that Kitten) in addition to the standard crimeware groups and hacktivists. With this understanding I should be able to evaluate and select a threat intel/data provider that could then be used to instrument my network (preventive and detective controls) to highlight activity by these groups. At this point you would start asking yourself if you need a provider that covers all of the types of threat actors/groups, if you need vertical-specific feeds, and if you need to ensure that you have a process to take the feeds and instrument your environment. The answer to all three is likely to be yes.
Continuing with the example, let’s say I selected a provider that provides analyst-derived/proprietary intel in addition to cultivating widely available open source information. This information should be centralized so that an operator can assess the validity and applicability of the information being shared and determine the course of action on how to integrate this into the preventive and/or detective controls. A simple example of this may be validating the list of known-bad IPs and updating the firewall (network and possibly host-based) with blocks/denies for these destinations. Or, updating the web proxy to block traffic to known-bad URLs or domains/sub-domains. One thing that shouldn’t be overlooked here is that we trigger an alert on this activity for later reporting on the efficacy of our controls and/or the type of activity we are seeing on our network. This type of data is often lacking in many organizations, and they struggle to create management-level intel reports that are specific to the organization and highlight the current and historical activity being observed. In addition, we could also take the indicators and implement detection rules in our log management/SIEM to detect and alert on this activity. Again, keep in mind that for an organization just standing up a threat intel service these may be manual processes that have the possibility of being partially automated in a later or more mature version of the service.
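The flow in this example (centralize, validate, push to controls, alert on hits) can be sketched in a few lines. This is only an illustration under assumptions: the `Indicator` fields, the feed names, and the shape of the log events are invented here, and a real deployment would talk to the firewall, proxy, and SIEM through whatever interfaces those products actually expose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    """One centralized indicator (hypothetical shape)."""
    value: str        # an IP address or a domain name
    kind: str         # "ip" or "domain"
    source: str       # which feed it came from
    validated: bool   # set by the operator after reviewing validity/applicability


def blocklist(indicators, kind):
    """Values of validated indicators of one kind, ready to hand to a control."""
    return sorted({i.value for i in indicators if i.validated and i.kind == kind})


def match_events(events, indicators):
    """Yield an alert for every log event whose destination is a validated indicator."""
    lookup = {i.value: i for i in indicators if i.validated}
    for event in events:
        hit = lookup.get(event["dest"])
        if hit is not None:
            yield {
                "src": event["src"],
                "dest": event["dest"],
                "action": event["action"],  # what the control did, e.g. "deny"
                "feed": hit.source,
            }


if __name__ == "__main__":
    # Made-up indicators using documentation-reserved addresses and domains.
    indicators = [
        Indicator("203.0.113.7", "ip", "commercial-feed", True),
        Indicator("bad.example.com", "domain", "open-source-feed", True),
        Indicator("198.51.100.9", "ip", "open-source-feed", False),  # not yet reviewed
    ]
    events = [
        {"src": "10.0.0.5", "dest": "203.0.113.7", "action": "deny"},
        {"src": "10.0.0.8", "dest": "bad.example.com", "action": "allow"},
        {"src": "10.0.0.9", "dest": "198.51.100.9", "action": "allow"},
    ]
    print("firewall blocks:", blocklist(indicators, "ip"))
    print("proxy blocks:", blocklist(indicators, "domain"))
    for alert in match_events(events, indicators):
        print("alert:", alert)
```

Keeping the control's action on each alert is what later lets you report on efficacy: a "deny" shows a preventive control working, while an "allow" on a known-bad destination is something that was only detected.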
As a side note, one thing I’ve noticed from many of the SIEM vendors is how they try to sell everyone on the “intel feeds” that their product has and how they are “already integrated”. The problem I have with this “integration” is that you are receiving the vendor’s feed and not one of your choosing. If SIEM vendors were smart they would not only offer their own feeds but also open up integrations with customer-created feeds that are being generated from their intel program. As it stands today this integration is not as straightforward as it should be. Then again, we also aren’t doing a very good job of standardizing the format of our intel, despite STIX/CybOX/TAXII, OpenIOC, IODEF, etc. and the transfer mechanisms (API, JSON, XML, etc.) being around for a while now.
To round out this example, it is also important that, as we instrument our environment, we track the alerts generated by our indicators back to the category or type (e.g. nation-state, crimeware, hacktivist, etc.) and if possible back to the specific origin of the threat (e.g. Ukrainian crimeware groups, Deep Panda, Anonymous, etc.). This is going to be key in monitoring for and reporting on threat activity so we can track historical changes and better predict future activity. We can also use this information to re-evaluate our control set as we map the attacks by kind/type/vector and effectiveness (i.e. was the attack blocked at delivery) or ineffectiveness (i.e. was a system compromised and only detected through monitoring) and map these against the kill chain. This type of information translates very well into both overall security architecture and funding requests.
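A small sketch of that kind of tracking, assuming each alert has already been tagged with a threat category, an actor where one is known, a kill chain stage, and whether a control blocked it. The field names and sample alerts are made up for the example.

```python
from collections import Counter


def summarize(alerts):
    """Count alerts by threat category, by named actor, and by outcome per stage."""
    by_category = Counter(a["category"] for a in alerts)
    by_actor = Counter(a["actor"] for a in alerts if a["actor"])
    by_outcome = Counter(
        (a["stage"], "blocked" if a["blocked"] else "detected only")
        for a in alerts
    )
    return by_category, by_actor, by_outcome


if __name__ == "__main__":
    # Made-up alerts; "stage" uses kill chain phase names.
    alerts = [
        {"category": "nation-state", "actor": "Deep Panda",
         "stage": "delivery", "blocked": True},
        {"category": "crimeware", "actor": None,
         "stage": "delivery", "blocked": True},
        {"category": "crimeware", "actor": None,
         "stage": "command and control", "blocked": False},
    ]
    by_category, by_actor, by_outcome = summarize(alerts)
    print(by_category.most_common())
    print(by_actor.most_common())
    for (stage, outcome), count in sorted(by_outcome.items()):
        print(f"{stage}: {outcome} x{count}")
```

Run over each reporting period, counts like these give the historical trend by actor type, and the stage/outcome pairs show where in the kill chain the controls are, or are not, working.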
While this is a new and seemingly complex area for information security professionals it really isn’t that difficult to get started. This post highlighted only a few simple examples and there are many more that could be part of your phase 1 deployment of a threat intel service. My only parting advice would be to make sure you have a strategy and mission statement for the service, determine your threat landscape, define what you need out of your feeds and acquire them, centralize the information and utilize it by instrumenting and monitoring your environment. While some of the vendors in this space have great sales pitches and even really cool logos, you had better understand your requirements (and limitations) prior to scheduling a bunch of vendor demos.
Who likes dependencies anyway??? Not me…so here is a shell script to get Cuckoo Sandbox v1.1 installed
I realized that I was spending an inordinate amount of time rebuilding Cuckoo Sandbox (http://cuckoosandbox.org) in my home lab just because I was starting from a fresh Ubuntu install which does not ship with all of the dependencies and packages that are required by Cuckoo. I also break this system quite often and in such spectacular ways that the only recovery mechanism is to rebuild the system from the OS up. This, unfortunately, also leads back to spending way too much time post-OS install in rebuilding Cuckoo. There has to be a better way…and so there is, using a shell script I wrote to get me up and running in no time after a rebuild.
So what do I need to run this script?
The script (located here: cuckoo_install - right-click and save as, rename to .sh) assumes you have a base install of Ubuntu 12.04LTS and that you have updated through an apt-get update and an apt-get dist-upgrade. It was also created to work specifically for Cuckoo Sandbox v1.1. Beyond that you’re on your own to set networking and the user accounts as you see fit. In my case I use the account created during OS install for everything on this system and I have a physically and logically segmented network just for the sandbox and the virtual machines used to detonate the malware. These systems are directly connected to the internet and sit behind a Cisco ASA which is logging all accepts and denies to a Splunk instance and the connection is tapped using a VSS 12×4 distributed tap and the traffic is captured using the free version of NetWitness Investigator. I’m also running a VM instance of INetSim (http://www.inetsim.org) that supplies DNS, FTP, and other services that may be required by the malware (i.e. through faking a DNS response to point the malware to a system I control).
What happens when I run the script?
Assuming your base Ubuntu system has connectivity to the internet, the script will proceed to download and install all of the dependencies and packages required to run Cuckoo Sandbox v1.1 (again, this assumes you’re on 12.04LTS as a base OS). There is a built-in check at the start that will verify your version and will error out if you’re on something other than 12.04LTS. If you think this will work even if you’re not on 12.04LTS you can, at your own risk, comment out this section and force the script to run. The script runs in sections and requires that you hit enter before proceeding to the next section. I put this in so you could review the status of a section (i.e. no errors) before continuing on to the next section of the script. If you find that annoying simply comment out all of the “read” commands in the script and it will run start to end, however it becomes difficult to identify any install errors given the length of the output. Other than that the script will install what is required for Cuckoo, and after running you can address any errors or issues with the installed components to ensure everything is installed correctly.
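For readers who want to see the shape of those two behaviors without opening the script, here is a rough Python sketch of a release check and a pause between sections. It is not the shell script itself and does not claim to mirror how the script does it: the function names are invented, and reading the release from the `DISTRIB_RELEASE` line of `/etc/lsb-release` is an assumption about how one might do the check.

```python
REQUIRED_RELEASE = "12.04"


def parse_lsb_release(text):
    """Turn the KEY=value lines of an lsb-release file into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip().strip('"')
    return info


def is_supported(text, required=REQUIRED_RELEASE):
    """True when the lsb-release text reports the required Ubuntu release."""
    return parse_lsb_release(text).get("DISTRIB_RELEASE") == required


def pause(section):
    """Wait for Enter so the output of a section can be reviewed for errors."""
    input(f"Finished: {section}. Press Enter to continue...")


if __name__ == "__main__":
    # Sample file contents (assumption); on a real system read /etc/lsb-release.
    sample = "DISTRIB_ID=Ubuntu\nDISTRIB_RELEASE=12.04\nDISTRIB_CODENAME=precise\n"
    print("supported release:", is_supported(sample))
```

In a real installer you would call `pause()` after each group of installs, which is the equivalent of the “read” commands mentioned above, and exit early when `is_supported()` returns False.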
What do I need to do following the script to get Cuckoo up and running?
This is going to be highly dependent on your individual setup, however you need to get your virtual machines built and/or transferred into VirtualBox and set the snapshots that will be used (plenty of good info on the net on this step such as http://santi-bassett.blogspot.com/2013/01/installing-cuckoo-sandbox-on-virtualbox.html). You also need to add your user account to the virtualbox group, download the malware.py file if you plan on using Volatility, and set up your network for your particular needs.
Can I modify the script and/or what if it doesn’t work?
I’m posting this script as-is. It works for my needs in my lab environment which may not be the same as yours. Feel free to mod it as required, however all I ask is if you make significant improvements to the script that you share it back to the community. I’m not going to actively maintain the script or make modifications in the future as this is a one shot deal (I have a $dayjob that actually pays the bills).
Note: If you’re new to Cuckoo or Ubuntu I’d actually recommend trying a manual install if you have time. I realized I learned quite a lot about the required packages and how the system functions when I struggled to get Cuckoo up and running a few years ago. It makes troubleshooting issues I encounter now much easier.
Research paper on Snort rule development for the major fault attack on Allen Bradley MicroLogix 1400 controllers
As part of a course I took last quarter at DePaul University on critical infrastructure security I drew the straw on one of our group labs which required that we write a Snort signature for an attack on the Allen Bradley MicroLogix 1400 series controllers. The attack, written for Metasploit by Matt Luallen of Cybati in September of last year, sets a bit in a data file on the controller that indicates to the controller that there is a major logical fault. This attack stops the running program on the controller, and the fault must be manually cleared (either through physical interaction with the controller or by clearing the fault using the RSMicroLogix application).
The results of this research project will likely be published in the future in a more formal fashion, but until then I wanted to post a sneak peek at the report for those who may be interested. Note that I wrote this a few months ago and held off on publishing it as it was being copy edited for publication. As I assume that process has died I am left with no choice but to publish this work…no sense in holding on to something that could be of value to someone else.
A link to the PDF is here.
Since I don’t have time to actually write the articles I want to, I thought I’d add a post to share my collection of photos of broken systems. These are systems I find in public places, like hotels, airports, and grocery stores and I take a picture. So here’s my collection:
The photos above are pretty old, but as I was walking past the Chase ATM at the local Dominick’s I noticed that a start menu was displayed on the screen. And being curious I had to touch it to see what was installed. Oddly enough it had Windows Movie Maker, which I find to be a strange application to have installed on an ATM. Also curious that ActiveState Perl and Acrobat Reader were installed…would seem to me that the image for the ATMs was bloated.
Above is the airport collection, although I could swear that I had more of these…Flight notification screens, baggage claim, and an internet kiosk. By the way, who in the hell uses this system? I have a feeling some sucker bought this in a “get rich overnight” scam where they “own” the system and sit at home and make it big! Suckers.
As much as I enjoy staying at the Cosmopolitan in Vegas they always seem to have an issue with that cool display system they run in the lobby, throughout the casino, and the elevators. The first one was just a licensing issue with the software, which I originally took so I could remember the name of the app. Surprisingly, it isn’t that expensive. The second photo is from the LCD screens in the elevator.
Keeping with the elevator track, the above two photos are from the elevators in the Aon Center where my office is…I think. I’m not really sure since I only go there once a month or so. I wasn’t sure if I should be worried that I’d get stuck in the elevator, but then I remembered that the screen on the left shows the “elevator” news so no one needs to make eye contact on the long elevator rides up to the 59th floor. Assuming that the IP is a true public IP, it comes back as owned by Savvis in Missouri, somewhere just south of Chesterfield. Oh, and the kernel version is from 2007 and the SSID is bay15.
The last three are randoms. First is from DePaul University in the lobby where the app running the kiosk crashed…why? In the second one I think Best Buy needs to dispatch the Geek Squad, although this seems to be a Flash issue if watchdog.sys is causing the BSOD. Finally, a bar with a broken poker machine….running Linux.
JSPSpy is an interesting tool that once uploaded to a server that supports JSP pages gives you a user interface on the web server itself. Its power comes from the ability to upload/download, zip, and delete files at will on the web server as well as spawn a command prompt. In addition, if you are able to gain credentials to a database server serving the web application (say through an unencrypted database connection string) it has a database connection component as well which would allow one to crawl a backend database server for information.
There is one issue with the code, which I find odd given that it was created in 2009, in that the SQL driver and URL for the connection using JDBC are incorrect. Well, not incorrect; the issue is that they only support SQL Server 2000. Starting with SQL 2005 the driver and URL were changed…and the code for JSPSpy which is easily accessible on the internet has the old connection string.
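For reference, the difference comes down to the driver class name and the URL prefix used by the two generations of Microsoft’s JDBC driver. The sketch below just lays the two side by side as data; the helper function and the host, port, and database values are placeholders of mine, not anything taken from JSPSpy.

```python
# Driver class and URL format for the two generations of Microsoft's JDBC driver.
JDBC_SETTINGS = {
    "sqlserver2000": {
        "driver": "com.microsoft.jdbc.sqlserver.SQLServerDriver",
        "url": "jdbc:microsoft:sqlserver://{host}:{port};DatabaseName={database}",
    },
    "sqlserver2005": {
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "url": "jdbc:sqlserver://{host}:{port};databaseName={database}",
    },
}


def jdbc_connection(version, host, database, port=1433):
    """Return (driver class name, connection URL) for the given driver generation."""
    settings = JDBC_SETTINGS[version]
    url = settings["url"].format(host=host, port=port, database=database)
    return settings["driver"], url


if __name__ == "__main__":
    # Placeholder host and database names for illustration only.
    for version in JDBC_SETTINGS:
        driver, url = jdbc_connection(version, "dbhost.example.com", "appdb")
        print(version, driver, url, sep="\n  ")
```

Note how easy the two are to confuse: the package segments `jdbc` and `sqlserver` simply swap places in the class name, and the URL drops the `microsoft:` part.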
In addition, there are a few more UI’s for crawling a SQL backend using JSP floating around as well. I’ve included one in this demo as well.
The video demonstrates the power of JSPSpy in my demo environment consisting of Java 1.6, Tomcat 6.0, SQL 2005 and Windows 2003 Server. UPDATE: I updated the video on this as it appears it didn’t convert correctly and only shows in SD, not HD so the text is very hard to read. The new video below is in HD.
As a security professional it’s not often that people try to socially engineer me, especially over the phone. But, I thought the call I received was worthy of both a big laugh as well as a post. This got me thinking as well…is the going hourly rate for a person to sit and call people on the phone now low enough that it beats out automated malware and drive-bys? While I doubt that is the case I have to assume that since it is still a running scam, and I saw articles on this from August of this year, that they are making money. It also made me laugh as I took a trip down memory lane of having to do this as a consultant in a prior life, although I’d like to think my version was more convincing.
If you get it, here’s how the scam goes:
In my case it was a blocked call, and the person on the other end of the phone states they are with Microsoft. My guy’s name was Victor Dias (Indian accent) which didn’t quite make sense given his difficulty with spelling it when I asked. I’m kicking myself for not having a Win7 VM running at the time and following through on his instructions to see how this all ends, but I digress. He asked me to do some rudimentary things, such as go to Start, search for “ev”, and open the event viewer. Then he asked me if I have any errors or warnings in the Application logs, or if I have had any pop-ups stating that an application had crashed. Next, he asked if I had AV running (which of course I said no to) so he said “your computer is probably infected with the malwares (sp) and junks (sp), can you open Remote Assistance and allow me to connect so I can run a scan to remove the junks (sp)?”
Awesome! Going back to why I wanted to kick myself: I didn’t have a Windows 7 system in front of me…I so wanted to see what he was going to do, and in hindsight what I may have been able to do to him (disclaimer: I’m not advocating offensive operations, wink wink). At this point I was done with the scam and started to ask him a series of questions. What is your name? Can you spell that? What is your MS employee ID number? BTW, he answered with 44398…ummm, pretty sure they are 6 digits and not 5, to which he said “oh yes, mine is 5 digits”. In fact, you can find this info online, so a little research prior to the scam never hurts (you’re welcome for the free advice, Victor). What finally broke him was when I asked where he was calling from. Manvil, TX, or Manville, TX…he couldn’t spell the name of the city he was in. Then I asked which major city in Texas was closest to his location…he couldn’t answer. So when I gave him options of cities he simply hung up, knowing he wasn’t getting anywhere with me.
So, I have a Win7 VM, my copy of NetWitness, and some surprises ready in case Victor calls back. Here’s hoping to hear from you, Victor.
I attended an ISACA presentation at DePaul the other evening given by Eric Karshiev from Deloitte on the Zeus malware family and had a few thoughts that I wanted to post (link to the event is at the end).
First, kudos to Eric for a decent presentation even though, self-admittedly, he hasn’t done much public speaking in his career….all I can say is that it only gets easier the more you force yourself to do it.
Second, while the presentation was at the right level of technical detail for an ISACA meeting, and I don’t mean that in a derogatory way ISACA, there were also some really good questions from the students in attendance, which was very encouraging. I do believe an important first step in defending your organization comes from a thorough understanding of the threats you face as well as your risk profile based on what your company does, how it does it, and your likelihood of being targeted by attackers in addition to the general opportunistic attacks we see on a daily basis.
That being said, I think there were some great questions that may not have been fully answered during the course of the presentation, and I’d like to list those here and take a shot at answering. I took the liberty of paraphrasing some questions and consolidating them where it made sense…so here we go:
1. What is the number one attack vector for malware in the recent past?
I made this question broader and vaguer than it was asked in the presentation, but I did that on purpose so I could answer it a few different ways. First, social engineering and targeting the users is nothing new, so that has been and will be an attack vector that is used. More specifically, I’d point to client-side browser exploits utilizing vulnerabilities in the browser, and more likely the plug-ins and 3rd party apps such as Adobe and Java (as an example, the new Adobe X 0-day that was, or will be, released soon). I think this has been standard operating procedure for attackers for the past 4 years given how insecure and under-patched many of these applications are. We are pretty good at patching the OS layer, but not so good at patching 3rd party applications, especially as they exist on mobile laptops that aren’t always connected to the corporate network. One thing to keep an eye on in this space is HTML5. If it ends up being as popular as Java/Flash look for an increase in vulnerability identification and use in attacks. Don’t believe me? Look at all of the exploit kits out there (last time I looked at my list I had 34 of them) and look at the CVEs related to each of the exploit kits…they range from 2004-2011 and most target Java, Flash, and PDFs.
Want to see how insecure your 3rd party apps may be? Download and run Secunia PSI (free for personal use) and review the report.
2. Is Zeus targeted or opportunistic? Do I need to be more concerned about protecting a C-level exec, the rest of our users, or both?
Zeus, as a MITM banking Trojan, is by necessity an opportunistic attack. If it can steal $5 or $5000 it doesn’t really matter. The more systems I have compromised the more money I can make, therefore from an attacker’s perspective it makes sense to spread this as far and wide as I can. I don’t mean to generalize here, but my advice is to protect all of your users’ systems in the same way when it comes to opportunistic threats. On the other hand, you do need to be concerned about targeted attacks against executives and ensure they, and their admins, understand that they may be targeted. For example, we trained the execs at the law firm to help them proactively identify a targeted phishing attack. One day we received a call from an exec stating that they received an email that didn’t look legit and had a PDF attachment that they didn’t open. We immediately reviewed the attached PDF and it was weaponized (although poorly) to infect the system with a dropper and connect back to C2 to get a binary. When we looked at the content of the email message we noticed that it was unique enough to comb through all received mail messages for the same email and attachment. What we noticed was that 5 other messages like the one we had in our possession were sent, but only to executives of the firm. On top of that, each had a weaponized PDF attachment that was different from the others but had the same dropper functionality. The polymorphism was likely in place to evade IDS, mail filters, and AV…all of which were bypassed without issue.
3. You said AV isn’t effective given that it is signature based. What else can we do to protect users from being infected, and if we can’t protect them how can we detect malware?
This was a great question that went unanswered (at least to my satisfaction), and the one that actually spurred me to write this post. Yes, part of AV detection is signature based, but so are mail filters and IDS/IPS systems. It is true that these commodity controls can protect us from the “known” malware that is floating around the internet, but they can’t protect us from new malware…I think this is an obvious statement given the number of systems that are compromised on a regular basis.
That being said, there are some controls we can implement that aren’t signature based that can detect malware based on behavior. Since I mentioned social engineering, it may be helpful to give our users a hand in determining the “goodness or badness” of emails they receive by ranking them. Email analytics is a good start, and products have now sprung up that play in this space. ProofPoint is an example of a tool that may empower your users and allow them to make better decisions about emails they receive and what to do with them. It isn’t full-blown security data analytics, but it is a start. Another example of a vendor in this space is FireEye with their email and web products, which can identify executable attachments in email and those received from clicking on internet links (or drive-by downloaded), analyze them in a sandbox, and make a determination as to their “badness”. Damballa is another product focused on behavior analysis of malware as it uses the network…this makes sense as malware which doesn’t communicate to its owner isn’t very valuable. Their technology makes use of the known C2 systems as well as the fact that DGA-based malware generates many resolution requests and gets a bunch of NXDOMAIN (NX) responses back. Finally, Netwitness is an invaluable tool in both monitoring and incident response as it gives the visibility into the network that we have been lacking for so many years. And yes, there is a lot of overlap in these tools, so expect some consolidation in the coming years.
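To make the DGA point a bit more concrete, here is a rough sketch of the idea: count NXDOMAIN answers per client and flag the hosts that get a lot of them. This is not how Damballa or any other product actually works; the record format, sample data, and thresholds are all assumptions I made up for illustration, and real DNS logs will need their own parser.

```python
from collections import Counter

# Hypothetical, simplified DNS log records: (client_ip, queried_name, response_code).
# The names and addresses below are made-up sample data.
SAMPLE_DNS_LOG = [
    ("10.1.1.15", "www.example.com", "NOERROR"),
    ("10.1.1.15", "mail.example.com", "NOERROR"),
    ("10.1.1.15", "typo-domain.example.org", "NXDOMAIN"),
    ("10.1.1.20", "qwhdkzlpt.example.net", "NXDOMAIN"),
    ("10.1.1.20", "xbvnrtyqa.example.net", "NXDOMAIN"),
    ("10.1.1.20", "plmoknijb.example.net", "NXDOMAIN"),
    ("10.1.1.20", "zzxcvbasd.example.net", "NXDOMAIN"),
    ("10.1.1.20", "www.example.com", "NOERROR"),
]


def nxdomain_suspects(records, min_nx=3, min_ratio=0.5):
    """Return {client: (nx_count, nx_ratio)} for clients over both thresholds.

    min_nx and min_ratio are arbitrary starting points, not tuned values.
    """
    total = Counter()
    nx = Counter()
    for client, _name, rcode in records:
        total[client] += 1
        if rcode == "NXDOMAIN":
            nx[client] += 1

    suspects = {}
    for client, nx_count in nx.items():
        ratio = nx_count / total[client]
        if nx_count >= min_nx and ratio >= min_ratio:
            suspects[client] = (nx_count, round(ratio, 2))
    return suspects


if __name__ == "__main__":
    for client, (count, ratio) in nxdomain_suspects(SAMPLE_DNS_LOG).items():
        print(f"{client}: {count} NXDOMAIN responses ({ratio:.0%} of lookups)")
```

A host with a single fat-fingered lookup stays under the thresholds, while one that burns through a list of generated names stands out. Expect false positives from things like misconfigured search suffixes, so treat the output as a lead to investigate and not a verdict.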
I don’t mean to push vendors as a solution and would never throw technology at a situation to fix the underlying root causes – unpatched OS, browsers, and 3rd party applications open a nice attack surface for the bad guys. Why do we allow our users full control of their system? Do they all need to be admin? We also don’t seem to be doing a great job of monitoring the network and all of the systems we own…what bothers me most is that the attackers are attacking us on home turf. We own the battlefield and keep getting our a$$es handed to us.
4. There was a comment on the use of Palo Alto and Wildfire in relation to the use of the cloud and how that may help.
Almost all of the technologies mentioned above use the same mechanism, and this is nothing new as AV vendors have been doing this since they realized they could get good intel from all of their customers. My only caution is that the benefit realized from sending all bad binaries to a cloud service for analysis depends on how good that analysis is.
So to close, my suggestion to anyone interested in malware prevention, detection, and analysis is that there are some great resources on the internet as well as some decent classes you can take to better understand this threat. If analysis is your thing then I’d recommend Hacking – The Art of Exploitation and Practical Malware Analysis as some good reads. Set up a lab at home and experiment with some of the tools and techniques used by past and current malware…nothing beats hands-on work in this space as the more you know the better you are at malware identification and response.
Link to the presentation site – http://events.depaul.edu/event/zeus_malware_family_the_dark_industry#.UJ1baoXLDEs
In looking back at the last post about tapping your network prior to an incident I thought to myself, why did I stop there…or more appropriately, why was I focused on having the right infrastructure in place? Perhaps it was out of frustration of showing up to a client and wondering why it was so difficult to start quickly monitoring the environment to get a handle on the issue we may, or may not, be dealing with. But when I thought about it what I really needed was a Delorean with a flux capacitor and a crazy Doc Brown of my own so I could travel back in time (no, I don’t own a vest jacket or skateboard but may on occasion rock some Huey). I don’t have a time machine and can’t go back in time, but what I can do is make some recommendations that we start capturing information that may be useful in an incident response PRIOR to one happening. I don’t mean to dismiss the ability to rapidly scale up your monitoring efforts during a response, even if that means dropping new tools into the environment or calling on 3rd parties to assist, but I would be doing an injustice if I didn’t discuss what you should be collecting today.
Normally I’d be frustrated with myself for recommending that you collect information, logs, data, etc. and then do nothing with them. But, is that really a bad thing? I thought back to when I started at the law firm and many years ago when I asked what we monitored I was told we had network perimeter logs and that they were being sent to an MSSP for storage. Were we doing anything with these logs? No. Was the MSSP doing “some science” to them and telling us bad things may have been happening? No. So, at first I questioned the value, and sanity, of the decision to capture but do nothing with our information. But the more I thought about it the more I understood that IF something had happened I would/may have the information I need to start my investigation. Forensics guys may know this as the concept of the “order of volatility”, or how quickly something is no longer available for analysis. I’d say besides system memory that network connections would be very high up on that list. And if I weren’t at least capturing and storing these somewhere then they would be lost in the past because of their volatility. So, it isn’t that bad to just collect data, just in case.
I’d also like to temper the aspirations of folks who want to run out and log everything for the sake of logging everything. What I’d much rather see is that your logging and data collection be founded on sound principles. What you choose to focus on should be based on your risk, or what you’re trying to protect/prevent, and I hope to highlight some of this in the rest of this post. As an example, if you are logging successful authentications, or access to data it should be focused on your most valuable information. This will, for obvious reasons, vary from organization to organization. A manufacturing company is more likely trying to protect formulas, R&D data, plant specifications and not medical records like an organization in healthcare would. OK, on to the major areas of logging to focus on, or at least something to consider:
- Web proxies
- Routers/switches (netflow)
- Access to sensitive data via applications
- Database administrative actions/bulk changes
- Administrative access to DBs, applications, and systems (focus on critical systems first)
- SharePoint and other data repository access
- Email systems
- AV, Anti-malware, and end-point controls
- File integrity monitoring systems
- DLP (data loss prevention) tools
- Identity management systems and access governance systems
- Vulnerability information
- Operational data such as configuration information and process monitoring
And as Appendix A.2.1 of NIST 800-86 says, have a process to collect the data that is repeatable and be proactive in collecting data – be aware of the range of sources and application sources of this information.
While I realize the recommendations of this post are rather remedial, I still find organizations who haven’t put the right level of thought into what they log and why. Basically, the recommendation I may make is to first understand what you may log today, identify gaps in the current set of logs and remediate as necessary, and design your future state around a solid process for what you plan to collect and what you plan to do with it. More to come…
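The “understand what you log today and identify the gaps” step can start out as something as simple as a set comparison. Here is a minimal sketch, assuming you keep an inventory of current log sources somewhere; the short source names are my own labels for the list above and the “current” inventory is a made-up example.

```python
# Target sources, abbreviated from the list in this post (the labels are mine).
TARGET_SOURCES = {
    "web_proxy",
    "netflow",
    "sensitive_data_access",
    "db_admin_actions",
    "admin_access",
    "data_repository_access",
    "email",
    "endpoint_controls",
    "file_integrity",
    "dlp",
    "identity_management",
    "vulnerability_data",
    "operational_data",
}


def logging_gaps(current_sources, target_sources=TARGET_SOURCES):
    """Return the target sources that are not being collected yet, sorted."""
    return sorted(set(target_sources) - set(current_sources))


# Hypothetical inventory of what an organization collects today.
current = {"web_proxy", "email", "endpoint_controls", "firewall"}
gaps = logging_gaps(current)

if __name__ == "__main__":
    print("Not collected yet:")
    for source in gaps:
        print(f"  - {source}")
```

The code is the easy part. The real work is deciding, based on your risk, which of those gaps matter first and what you plan to do with the data once you have it.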
In responding to incidents there is one thing that stands out that I felt deserved a post and that is the topic of network taps and visibility. While large companies often have the necessary resources (e.g. money, time, engineers, other tools which require visibility into network traffic, etc.) to install and maintain network traffic taps or link aggregators, the number of companies I run into without ANY sort of tap or aggregator infrastructure surprises me. While it depends on the type of incident you’re dealing with, it is quite often the case that you’re going to want, better yet need, a very good view of your network traffic down to the packet level.
If you’re not convinced imagine this scenario: During a routine review of some logs you see that you have traffic leaving your US organization which is going to an IP address that is located somewhere in Asia. It appears to be TCP/80 traffic originating from a host on your network, so you assume it is standard HTTP traffic. But then you remember that you have a web proxy installed and all users should be configured to send HTTP requests through the proxy…so what gives? At this point your only hope is to view the firewall logs (hopefully you have these enabled at the right level), or you can go out and image the host to see what sites it was hitting and why. But, if you had packet level inspection available a simple query for the destination and source address would confirm if this is simply a mis-configured end user system, a set of egress rules on the firewall that were left behind that allow users to circumvent the proxy, or if it is C2 traffic to/from an infected host on your network.
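As a sketch of the kind of query I have in mind, the snippet below walks a set of flow records and pulls out internal hosts talking directly to the internet on web ports without going through the proxy. The internal range, the proxy address, and the record layout are all assumptions for illustration; substitute whatever your netflow or packet capture tool actually exports.

```python
import ipaddress

# Assumptions for illustration only: internal address space and proxy address.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")
PROXY_IPS = {ipaddress.ip_address("10.0.5.10")}

# Hypothetical flow records: (src_ip, dst_ip, dst_port).
SAMPLE_FLOWS = [
    ("10.0.5.10", "203.0.113.7", 80),     # proxy going out: expected
    ("10.0.20.44", "10.0.5.10", 8080),    # client to proxy: expected
    ("10.0.20.44", "198.51.100.23", 80),  # client straight to the internet
    ("10.0.30.9", "10.0.40.2", 80),       # internal web app: not egress
]


def direct_web_egress(flows, web_ports=(80, 443)):
    """Return flows where an internal, non-proxy host reaches an external
    address directly on a web port."""
    hits = []
    for src, dst, port in flows:
        src_ip = ipaddress.ip_address(src)
        dst_ip = ipaddress.ip_address(dst)
        if (
            port in web_ports
            and src_ip in INTERNAL_NET
            and dst_ip not in INTERNAL_NET
            and src_ip not in PROXY_IPS
        ):
            hits.append((src, dst, port))
    return hits


if __name__ == "__main__":
    for src, dst, port in direct_web_egress(SAMPLE_FLOWS):
        print(f"{src} -> {dst}:{port} bypassed the proxy")
```

This only tells you that the bypass happened. Deciding whether it is a mis-configured system, a leftover firewall rule, or C2 still requires looking at the packets themselves.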
Having taps, SPAN/mirror ports, or link aggregators in place PRIOR to an incident is the key to gaining visibility into your network traffic, even if you do not possess the monitoring tools today. It allows response organizations to “forklift” a crate of tools into your environment and gain access to the network traffic they need to begin the investigation. The main benefit of tapping your infrastructure prior to an incident is that you don’t need to go through an emergency change control at the start of the incident just to get these taps, SPAN, or aggregators installed. This is also technology that your network team may not be familiar with configuring, installing, or troubleshooting. So setting up your tapping infrastructure up front and being able to test it under non-stressful conditions is preferred. That being said, it is also important to remember that there are pros and cons on how you pre-deploy your solution, both in terms of technology and tap location.
A couple of questions should be answered up front when considering how to approach this topic:
- Can our current switching infrastructure handle the increased load due to the configuration of a SPAN or mirror port?
- Will we have multiple network tools (e.g. Fireeye, Damballa, Netwitness, Solera, DLP, etc.) that need the same level of visibility?
- If we tap at the egress point what is the best location for the tap, aggregator, or SPAN?
- Do we know what traffic we are not seeing and why?
Taps vs. Link Aggregators vs. SPAN/mirror
The simplest way to gain access to network traffic is to configure a switch, most likely one near your egress point, to SPAN or mirror traffic from one or more switch ports/VLANs to a monitor port/VLAN which can be connected to the network traffic monitoring tool(s). The downside of SPAN ports is that you can overwhelm the port, the switch CPU, or both depending on your hardware. If you send three 1G ports to a 1G SPAN port, and the three 1G links are all 50% saturated at peak, you are offering roughly 1.5G to a 1G port and will drop packets as soon as you surpass its bandwidth (oversubscription). The safest way to use a SPAN in this case is to mirror a single 1G port to a 1G mirror port. Also consider how many SPAN or mirror ports are supported by your switching hardware. Some lower end model switches will only support a single mirror port due to hardware limitations (switch fabric, CPU, etc.), while more expensive models will be able to support many more SPAN ports. I’m not going to get into local SPAN vs. RSPAN over L2 vs. ERSPAN over GRE…that is for the network engineers to figure out.
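The oversubscription math is simple enough to sanity check before you configure anything. Below is a small sketch; the function names are mine. By default it mirrors the back-of-the-napkin example above. The both_directions flag is an addition of mine to account for the fact that a full-duplex link carries traffic each way, so mirroring both receive and transmit can offer up to twice the utilization figure to the monitor port.

```python
def span_load_gbps(source_ports_gbps, utilization, both_directions=False):
    """Estimate the traffic offered to a mirror port at peak.

    source_ports_gbps: speeds of the mirrored ports, in Gbps
    utilization: peak utilization as a fraction of line rate (0.5 = 50%)
    both_directions: count receive and transmit separately (full duplex)
    """
    directions = 2 if both_directions else 1
    return sum(speed * utilization * directions for speed in source_ports_gbps)


def is_oversubscribed(source_ports_gbps, utilization, span_gbps,
                      both_directions=False):
    """True if the offered load exceeds what the mirror port can carry."""
    load = span_load_gbps(source_ports_gbps, utilization, both_directions)
    return load > span_gbps


if __name__ == "__main__":
    # The example from the text: three 1G ports at 50% into one 1G SPAN port.
    print(span_load_gbps([1, 1, 1], 0.5))        # offered load in Gbps
    print(is_oversubscribed([1, 1, 1], 0.5, 1))  # does it exceed the SPAN port?
```

If you mirror both directions of a single busy 1G link, it is worth running the numbers with both_directions=True, as even the “safe” one-to-one mirror can drop packets once the link is busy in both directions at the same time.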
Passive and active taps can alleviate some of the issues with dropped packets on a switch SPAN as they sit in-line to the connection being tapped and operate at line speed. The drawback is they may present a single point of failure as you now have an in-line connection bridging what is most likely your organization’s connection to the rest of the world. Also, keep in mind that passive taps have two outputs, one for traffic in each direction so you’ll need to ensure the monitoring tools you have or plan to purchase can accept this dual input/half duplex arrangement. Active taps on the other hand are powered so you’ll want to ensure you have redundancy on the power supply.
The last type of tap isn’t really a tap at all, but a link aggregator which allows you to supply inputs from either active/passive taps or switch SPAN ports which are then aggregated and sent to the monitoring tool(s). The benefit of an aggregator is that it can accept multiple inputs and supply multiple monitoring tools. Some of the more expensive models also have software or hardware filtering, so you can send specific types of traffic to specific monitoring tools if that is required.
Last but not least are the connection types you’ll be dealing with. Most monitoring tools mentioned in this post accept 1G copper up to 10G fiber inputs, depending on the tool and model. You also need to make sure your taps and/or aggregators have the correct types of inputs and outputs that will be required to monitor your network. If you’re tapping the egress point chances are you’re dealing with a 1G copper connection, as most of us rarely have a need for more than 1G of internet bandwidth. If you’re tapping somewhere inside your network you may be dealing with 1G or 10G connections over copper, fiber, or a combination (e.g. 10/100/1000 Base-T, RJ-45 Ethernet, 1000 Base-Sx/Lx/Zx Ethernet multimode or singlemode), so keep this in mind as you specify your tapping equipment.
Location – Outside, Inside, DMZ, Pre or Post Proxy? What About VM Switches?
Next is the issue of location of the network tap and the answer to this really depends on what level of visibility you require. At a minimum I’d want to tap the ingress/egress points for my network, that is any connection between my organization and the rest of the world. But that doesn’t quite answer the question as I still have options such as outside the firewall, directly inside the firewall (my internal segment), or just after my web proxy or IPS (assumes in-line) or inside the proxy.
There are some benefits and drawbacks to each of these options; however I’m most often interested in traffic going between my systems and the outside world. The answer mainly depends on your network setup and the tools you have (or will have) at your disposal. If you tap outside the firewall then you can see all traffic, both traffic which is allowed and that which may be filtered (inbound) by the firewall. The drawback is both noise and the fact that everything appears to originate from the public IP address space we have, as I’m assuming NAT, overload NAT, PAT, etc. is in use in 99% of configurations. The next point to consider is just inside the firewall; however that depends on where you consider the inside to be. If we call it the inside interface (that which our end users connect through) then I will gain visibility into traffic pre-NAT which shows me the end-user’s IP address, assuming an in-line (explicit) proxy is not being used, as that would make all web or other traffic routed through the proxy appear to originate from the proxy itself. Not forgetting the DMZ, we may also tap our traffic as it leaves the DMZ segment through a tap or SPAN, as that will allow for monitoring of egress/ingress traffic but not inter-DMZ traffic.
Pre or post-proxy taps need to be considered based on a few factors as well. If it is relatively simple to track a session that is identified post-proxy back to the actual user or their system, and it is cheaper for me to tap post-proxy, then go for it. If we really need to see who originated the traffic, and what that traffic may look like prior to being filtered by a proxy, then we should consider tapping inside the proxy. In most situations I’d settle for a tap inside the proxy, just inside the firewall prior to NAT/PAT, and just prior to leaving the DMZ segment. To achieve this you may be looking at deploying multiple SPANs/taps and using a link aggregator to aggregate the monitored traffic per egress point.
Finally, what about all the virtual networking? Well, there are point solutions for this as well. Gigamon’s GigaVue-VM is an example of new software technology that is allowing integration with a virtual switch infrastructure. While this remains important if we need to monitor inter-VM traffic, all of these connections out of a VM server (i.e. ESXi) need to turn physical at some point and are subject to the older physical technologies mentioned above.
This should be a standard section on encryption and how it may blind the monitoring tools. Some tools can deal with the fact that they “see” encrypted traffic on non-standard ports and report that as suspicious. Some don’t really care as they are looking at a set of C2 destinations and monitoring for traffic flows and amounts. If you’re worried about encryption during a response you probably should be…and if you’re really concerned consider looking into encryption breaking solutions (e.g. Netronome). Outside of the encryption limitation, after you deploy your tapping infrastructure your network diagrams should be updated (don’t care who does this, just get it done) to identify the location, ports, and type of component of your solution along with any limitations on traffic visibility. Knowing what you can’t see in some cases is almost as important as what you can see.
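On the point about tools flagging encrypted traffic on non-standard ports, the check itself is not magic. A TLS session starts with a handshake record whose first byte is 0x16 (content type 22), followed by a version field that begins with 0x03. The sketch below looks for that pattern in the first bytes of a connection’s payload. The port list and function names are my own assumptions, and I am not claiming this is how any particular monitoring product does it.

```python
# Assumption: ports where TLS is expected in this environment. Tune for yours.
STANDARD_TLS_PORTS = {443, 465, 636, 993, 995, 8443}


def looks_like_tls_handshake(payload: bytes) -> bool:
    """True if the first bytes resemble a TLS handshake record header."""
    return (
        len(payload) >= 3
        and payload[0] == 0x16  # record content type 22 = handshake
        and payload[1] == 0x03  # major version 3 (SSL 3.0 and all of TLS 1.x)
        and payload[2] <= 0x04  # minor version 0 through 4
    )


def tls_on_unexpected_port(dst_port: int, payload: bytes) -> bool:
    """Flag a TLS handshake headed to a port where we do not expect TLS."""
    return looks_like_tls_handshake(payload) and dst_port not in STANDARD_TLS_PORTS


if __name__ == "__main__":
    client_hello_start = bytes([0x16, 0x03, 0x01, 0x02, 0x00, 0x01])
    print(tls_on_unexpected_port(8081, client_hello_start))
    print(tls_on_unexpected_port(443, client_hello_start))
    print(tls_on_unexpected_port(8081, b"GET / HTTP/1.1\r\n"))
```

Keep in mind this only catches standard TLS. Malware using custom encryption or encoding won’t have this header, which is one more item for the list of things you can’t see.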
Find your egress points, understand the network architecture and traffic flow, decide where and how to tap, and deploy the tapping infrastructure prior to having a need to use it…even if you don’t plan on implementing the monitoring tools yourself. This is immensely beneficial to the incident responders in terms of gaining network visibility as quickly as possible. As time is of the essence in most responses, please don’t make them sit and wait for your network team to get an approval to implement a tap just to find out they put it in the wrong place or it needs to be reconfigured.
If this needs to be sold as an “operational” activity for the network team, keep in mind that tapping and monitoring the network has uncovered many mis-configured or sub-optimal network traffic flows, everything from firewall rules which are too permissive to clear text traffic which was thought to be sent or received over encrypted channels. And who knows, if you ever get around to installing network-based DLP you’re already on your way as you’ll have tapped the network ahead of deployment.