Malos Ojos Security Blog

DePaul ISACA Meeting on Zeus – Some Thoughts

by on Nov.11, 2012, under General

I attended an ISACA presentation at DePaul the other evening given by Eric Karshiev from Deloitte on the Zeus malware family and had a few thoughts that I wanted to post (link to the event is at the end).

First, kudos to Eric for a decent presentation even though, self-admittedly, he hasn’t done much public speaking in his career….all I can say is that it only gets easier the more you force yourself to do it.

Second, the presentation was at the right level of technical detail for an ISACA meeting (and I don’t mean that in a derogatory way, ISACA), and there were also some really good questions from the students in attendance, which was very encouraging.  I do believe an important first step in defending your organization comes from a thorough understanding of the threats you face as well as your risk profile based on what your company does, how it does it, and your likelihood of being targeted by attackers in addition to the general opportunistic attacks we see on a daily basis.

That being said, I think there were some great questions that may not have been fully answered during the course of the presentation, and I’d like to list those here and take a shot at answering.  I took the liberty of paraphrasing some questions and consolidating them where it made sense…so here we go:

1. What is the number one attack vector for malware in the recent past?

I made this question broader and more vague than it was asked in the presentation, but I did that on purpose so I could answer it a few different ways.  First, social engineering and targeting the users is nothing new, so that has been and will continue to be an attack vector that is used.  More specifically, client-side browser exploits utilize vulnerabilities in the browser, and more likely in the plug-ins and 3rd party apps such as Adobe and Java (as an example, the new Adobe X 0-day that was, or will be, released soon).  I think this has been standard operating procedure for attackers for the past 4 years given how insecure and under-patched many of these applications are.  We are pretty good at patching the OS layer, but not so good at patching 3rd party applications, especially as they exist on mobile laptops that aren’t always connected to the corporate network.  One thing to keep an eye on in this space is HTML5.  If it ends up being as popular as Java/Flash, look for an increase in vulnerability identification and use in attacks.  Don’t believe me?  Look at all of the exploit kits out there (last time I looked at my list I had 34 of them) and look at the CVEs related to each of the exploit kits…they range from 2004-2011 and most target Java, Flash, and PDFs.

Want to see how insecure your 3rd party apps may be?  Download and run Secunia PSI (free for personal use) and review the report.

2. Is Zeus targeted or opportunistic?  Do I need to be more concerned about protecting a C-level exec, the rest of our users, or both?

Zeus, as a MITM banking Trojan, is by necessity an opportunistic attack.  If it can steal $5 or $5000, it doesn’t really matter.  The more systems I have compromised, the more money I can make, therefore from an attacker’s perspective it makes sense to spread this as far and wide as I can.  I don’t mean to generalize here, but my advice is to protect all of your users’ systems in the same way when it comes to opportunistic threats.  On the other hand, you do need to be concerned about targeted attacks against executives and ensure they, and their admins, understand that they may be targeted.  For example, we trained the executives at the law firm to help them proactively identify a targeted phishing attack.  One day we received a call from an exec stating that they had received an email that didn’t look legit, with a PDF attachment that they didn’t open.  We immediately reviewed the attached PDF and found it was weaponized (although poorly) to infect the system with a dropper and connect back to C2 to get a binary.  When we looked at the content of the email message we noticed that it was unique enough to comb through all received mail messages for the same email and attachment.  What we found was that 5 other messages like the one we had in our possession were sent, but only to executives of the firm.  On top of that, each had a weaponized PDF attachment that was different from the others but had the same dropper functionality.  The polymorphism was likely in place to evade IDS, mail filters, and AV…all of which were bypassed without issue.
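As an aside, the comb-through we did is easy to approximate with a bit of scripting.  Here is a rough Python sketch of the idea: parse archived .eml files, hash any PDF attachments, and group by subject and recipient to spot a campaign.  The archive path and field choices are illustrative assumptions, not what we actually ran at the firm.

    # Hunt for related phishing messages - a sketch, assuming mail is
    # archived on disk as .eml files under mail_archive/ (hypothetical path).
    import hashlib
    import os
    from collections import defaultdict
    from email import policy
    from email.parser import BytesParser

    def pdf_hashes(msg):
        """Yield a SHA-256 digest for each PDF attachment in a message."""
        for part in msg.walk():
            if part.get_content_type() == "application/pdf":
                payload = part.get_payload(decode=True)
                if payload:
                    yield hashlib.sha256(payload).hexdigest()

    campaigns = defaultdict(list)  # subject -> [(recipient, attachment hash)]
    for root, _, files in os.walk("mail_archive"):
        for name in files:
            if name.endswith(".eml"):
                with open(os.path.join(root, name), "rb") as fh:
                    msg = BytesParser(policy=policy.default).parse(fh)
                for digest in pdf_hashes(msg):
                    campaigns[msg["subject"]].append((msg["to"], digest))

    # One subject, several distinct attachment hashes, and a short list of
    # executive recipients: that is the polymorphic campaign described above.
    for subject, hits in campaigns.items():
        if len({digest for _, digest in hits}) > 1:
            print(subject, hits)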

3. You said AV isn’t effective given that it is signature based.  What else can we do to protect users from being infected, and if we can’t protect them how can we detect malware?

This was a great question, and the one that actually spurred me to write this post, as it went unanswered (at least to my satisfaction).  Yes, part of AV detection is signature based, but so are mail filters and IDS/IPS systems.  It is true that these commodity controls can protect us from the “known” malware that is floating around the internet, but they can’t protect us from new malware…I think this is an obvious statement given the number of systems that are compromised on a regular basis.

That being said, there are some controls we can implement that aren’t signature based and can detect malware based on behavior.  Since I mentioned social engineering, it may be helpful to give our users a hand in determining the “goodness or badness” of emails they receive by ranking them.  Email analytics is a good start, and products have now sprung up that play in this space.  ProofPoint is an example of a tool that may empower your users and allow them to make better decisions about the emails they receive and what to do with them.  It isn’t full-blown security data analytics, but it is a start.  Another example of a vendor in this space is FireEye, whose email and web products can identify executable attachments in email and those received from clicking on internet links (or drive-by downloads), analyze them in a sandbox, and make a determination as to their “badness”.  Damballa is another product focused on behavior analysis of malware as it uses the network…this makes sense, as malware which doesn’t communicate with its owner isn’t very valuable.  Their technology makes use of known C2 systems as well as the fact that DGA-based malware generates many resolution requests and gets a bunch of NXDOMAIN responses back.  Finally, NetWitness is an invaluable tool in both monitoring and incident response as it gives us the visibility into the network that we have been lacking for so many years.  And yes, there is a lot of overlap in these tools, so expect some consolidation in the coming years.
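To make the DGA point concrete, the core detection idea is simple enough to sketch: count NXDOMAIN responses per internal host over a window and flag the outliers.  This toy version assumes a whitespace-delimited export of DNS responses (timestamp, client, query name, response code), which is an invented format; it is nowhere near what a product like Damballa actually does, but it shows the behavioral angle.

    # Toy NXDOMAIN spike detector - illustrative only.
    # Assumed log format: <timestamp> <client_ip> <qname> <rcode>
    from collections import Counter

    THRESHOLD = 100  # NXDOMAINs per log window; tune to your environment

    nx_counts = Counter()
    with open("dns.log") as fh:  # hypothetical DNS response export
        for line in fh:
            fields = line.split()
            if len(fields) == 4 and fields[3] == "NXDOMAIN":
                nx_counts[fields[1]] += 1

    for client, count in nx_counts.most_common():
        if count < THRESHOLD:
            break
        print(f"{client}: {count} NXDOMAINs - possible DGA-based malware")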

I don’t mean to push vendors as a solution and would never throw technology at a situation to fix the underlying root causes – unpatched OSes, browsers, and 3rd party applications open a nice attack surface for the bad guys.  Why do we allow our users full control of their systems?  Do they all need to be admins?  We also don’t seem to be doing a great job of monitoring the network and all of the systems we own…what bothers me most is that the attackers are attacking us on home turf.  We own the battlefield and keep getting our a$$es handed to us.

4. There was a comment on the use of Palo Alto and WildFire, in relation to the use of the cloud and how that may help.

Almost all of the technologies mentioned above use the same mechanism, and this is nothing new, as AV vendors have been doing this since they realized they could get good intel from all of their customers.  My only caution is that the benefit realized from sending suspect binaries to a cloud service for analysis depends entirely on how good that analysis is.

So to close, my suggestion to anyone interested in malware prevention, detection, and analysis is that there are some great resources on the internet as well as some decent classes you can take to better understand this threat.  If analysis is your thing then I’d recommend Hacking – The Art of Exploitation and Practical Malware Analysis as some good reads.  Set up a lab at home and experiment with some of the tools and techniques used by past and current malware…nothing beats hands-on work in this space, as the more you know the better you are at malware identification and response.

Link to the presentation site – http://events.depaul.edu/event/zeus_malware_family_the_dark_industry#.UJ1baoXLDEs


OK, maybe you should do more than tap your network before an incident…

by on Sep.19, 2012, under General

In looking back at the last post about tapping your network prior to an incident I thought to myself, why did I stop there…or more appropriately, why was I focused only on having the right infrastructure in place?  Perhaps it was out of the frustration of showing up at a client and wondering why it was so difficult to quickly start monitoring the environment to get a handle on the issue we may, or may not, have been dealing with.  But when I thought about it, what I really needed was a DeLorean with a flux capacitor and a crazy Doc Brown of my own so I could travel back in time (no, I don’t own a vest jacket or skateboard, but I may on occasion rock some Huey).  I don’t have a time machine and can’t go back in time, but what I can do is recommend that we start capturing information that may be useful in an incident response PRIOR to one happening.  I don’t mean to dismiss the ability to rapidly scale up your monitoring efforts during a response, even if that means dropping new tools into the environment or calling on 3rd parties to assist, but I would be doing an injustice if I didn’t discuss what you should be collecting today.

Normally I’d be frustrated with myself for recommending that you collect information, logs, data, etc. and then do nothing with them.  But is that really a bad thing?  I thought back to when I started at the law firm many years ago: when I asked what we monitored, I was told we had network perimeter logs and that they were being sent to an MSSP for storage.  Were we doing anything with these logs?  No.  Was the MSSP doing “some science” to them and telling us bad things may have been happening?  No.  So, at first I questioned the value, and sanity, of the decision to capture but do nothing with our information.  But the more I thought about it the more I understood that IF something had happened I would, or at least may, have the information I needed to start my investigation.  Forensics guys may know this as the concept of the “order of volatility”, or how quickly something is no longer available for analysis.  I’d say that besides system memory, network connections would be very high up on that list.  And if I weren’t at least capturing and storing these somewhere then they would be lost in the past because of their volatility.  So, it isn’t that bad to just collect data, just in case.

I’d also like to temper the aspirations of folks who want to run out and log everything for the sake of logging everything.  What I’d much rather see is that your logging and data collection be founded on sound principles.  What you choose to focus on should be based on your risk, or what you’re trying to protect/prevent, and I hope to highlight some of this in the rest of this post.  As an example, if you are logging successful authentications or access to data, it should be focused on your most valuable information.  This will, for obvious reasons, vary from organization to organization.  A manufacturing company is more likely trying to protect formulas, R&D data, and plant specifications, not medical records like an organization in healthcare would be.  OK, on to the major areas of logging to focus on, or at least to consider:

Perimeter/Network Systems

  • Firewalls
  • Web proxies
  • Routers/switches (netflow)
  • DHCP/DNS

Application/Database

  • Access to sensitive data via applications
  • Database administrative actions/bulk changes
  • Administrative access to DBs, applications, and systems (focus on critical systems first)
  • SharePoint and other data repository access
  • Email systems

Security Controls

  • IDS/IPS
  • AV, Anti-malware, and end-point controls
  • File integrity monitoring systems
  • Data loss prevention (DLP) tools

Contextual Information

  • Identity management systems and access governance systems
  • Vulnerability information
  • Operational data such as configuration information and process monitoring

And as Appendix A.2.1 of NIST SP 800-86 says, have a process for collecting the data that is repeatable, and be proactive in collecting it – be aware of the full range of data sources, including application sources, of this information.

While I realize the recommendations of this post are rather remedial, I still find organizations that haven’t put the right level of thought into what they log and why.  Basically, the recommendation I’d make is to first understand what you log today, identify gaps in that current set of logs and remediate as necessary, and then design your future state around a solid process for what you plan to collect and what you plan to do with it.  More to come…



Tap Your Network BEFORE You Have an Incident

by on Jul.07, 2012, under General, Incident Response

In responding to incidents there is one thing that stands out that I felt deserved a post, and that is the topic of network taps and visibility.  While some large companies often have the necessary resources (i.e. money, time, engineers, other tools which require visibility into network traffic, etc.) to install and maintain network traffic taps or link aggregators, the number of companies I run into without ANY sort of tap or aggregator infrastructure surprises me.  While it depends on the type of incident you’re dealing with, it is quite often the case that you’re going to want, better yet need, a very good view of your network traffic down to the packet level.

If you’re not convinced, imagine this scenario: during a routine review of some logs you see traffic leaving your US organization and going to an IP address located somewhere in Asia.  It appears to be TCP/80 traffic originating from a host on your network, so you assume it is standard HTTP traffic.  But then you remember that you have a web proxy installed and all users should be configured to send HTTP requests through the proxy…so what gives?  At this point your only hope is to view the firewall logs (hopefully you have these enabled at the right level), or you can go out and image the host to see what sites it was hitting and why.  But if you had packet-level inspection available, a simple query for the destination and source address would confirm whether this is simply a mis-configured end-user system, a set of egress rules on the firewall that was left behind and allows users to circumvent the proxy, or C2 traffic to/from an infected host on your network.
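And if you had that packet-level capture, the query really is simple.  Here is a sketch using scapy (one option among many; tshark or flow data would answer the same question), with placeholder addresses standing in for the suspect host and the destination in Asia:

    # Pull the suspect TCP/80 conversation out of a capture taken at the tap.
    # Requires scapy (pip install scapy); addresses below are placeholders.
    from scapy.all import rdpcap, IP, TCP

    SUSPECT_SRC = "10.1.2.3"       # internal host from the firewall logs
    SUSPECT_DST = "203.0.113.50"   # destination address seen in the logs

    for pkt in rdpcap("egress.pcap"):
        if IP in pkt and TCP in pkt:
            ip = pkt[IP]
            if {ip.src, ip.dst} == {SUSPECT_SRC, SUSPECT_DST} and \
                    80 in (pkt[TCP].sport, pkt[TCP].dport):
                print(pkt.time, ip.src, "->", ip.dst, len(pkt), "bytes")

A few minutes with the matching packets tells you whether this is genuine HTTP from a mis-configured client or something else masquerading on port 80.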

Having taps, SPAN/mirror ports, or link aggregators in place PRIOR to an incident is the key to gaining visibility into your network traffic, even if you do not possess the monitoring tools today.  It allows response organizations to “forklift” a crate of tools into your environment and gain access to the network traffic they need to begin the investigation.  The main benefit of tapping your infrastructure prior to an incident is that you don’t need to go through an emergency change control at the start of the incident just to get these taps, SPANs, or aggregators installed.  This is also technology that your network team may not be familiar with configuring, installing, or troubleshooting, so setting up your tapping infrastructure up front, where it can be tested under non-stressful conditions, is preferred.  That being said, it is also important to remember that there are pros and cons to how you pre-deploy your solution, both in terms of technology and tap location.

A couple of questions should be answered up front when considering how to approach this topic:

  1. Can our current switching infrastructure handle the increased load due to the configuration of a SPAN or mirror port?
  2. Will we have multiple network tools (i.e. FireEye, Damballa, NetWitness, Solera, DLP, etc.) that need the same level of visibility?
  3. If we tap at the egress point what is the best location for the tap, aggregator, or SPAN?
  4. Do we know what traffic we are not seeing and why?


Taps vs. Link Aggregators vs. SPAN/mirror

The simplest way to gain access to network traffic is to configure a switch, most likely one near your egress point, to SPAN or mirror traffic from one or more switch ports/VLANs to a monitor port/VLAN which can be connected to the network traffic monitoring tool(s).  The downside of SPAN ports is that you can overwhelm both the port and/or the switch CPU depending on your hardware.  If you send three 1G ports to a 1G SPAN port, and the three 1G links are all 50% saturated at peak, you will drop packets on the SPAN port shortly after you surpass the bandwidth of the 1G port (oversubscription).  The safest way to use a SPAN in this case is to mirror a single 1G port to a 1G mirror port.  Also consider how many SPAN or mirror ports are supported by your switching hardware.  Some lower-end model switches will only support a single mirror port due to hardware limitations (switch fabric, CPU, etc.), while more expensive models will support many more SPAN ports.  I’m not going to get into local SPAN vs. RSPAN over L2 vs. ERSPAN over GRE…that is for the network engineers to figure out.
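The back-of-the-envelope math is worth doing before you commit to a SPAN.  Using the numbers from the example above (and remembering that a mirrored full-duplex 1G port can offer up to 2G of combined rx+tx traffic):

    # SPAN oversubscription sanity check for the example above.
    links_gbps = [1.0, 1.0, 1.0]   # three 1G ports mirrored to one SPAN
    peak_utilization = 0.50        # 50% saturated at peak
    span_capacity_gbps = 1.0       # a single 1G SPAN port

    offered = sum(links_gbps) * peak_utilization   # 1.5 Gbps at peak
    if offered > span_capacity_gbps:
        drop = offered - span_capacity_gbps
        print(f"Oversubscribed by {drop:.1f}G at peak - expect dropped packets")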

Passive and active taps can alleviate some of the issues with dropped packets on a switch SPAN, as they sit in-line on the connection being tapped and operate at line speed.  The drawback is that they may present a single point of failure, as you now have an in-line device bridging what is most likely your organization’s connection to the rest of the world.  Also, keep in mind that passive taps have two outputs, one for traffic in each direction, so you’ll need to ensure the monitoring tools you have or plan to purchase can accept this dual-input/half-duplex arrangement.  Active taps, on the other hand, are powered, so you’ll want to ensure you have redundancy in the power supply.

The last type of tap isn’t really a tap at all, but a link aggregator which allows you to take inputs from either active/passive taps or switch SPAN ports, which are then aggregated and sent to the monitoring tool(s).  The benefit of an aggregator is that it can accept multiple inputs and supply multiple monitoring tools.  Some of the more expensive models also have software or hardware filtering, so you can send specific types of traffic to specific monitoring tools if that is required.

Last but not least are the connection types you’ll be dealing with.  Most monitoring tools mentioned in this post accept anything from 1G copper up to 10G fiber inputs, depending on the tool and model.  You also need to make sure your taps and/or aggregators have the correct types of inputs and outputs required to monitor your network.  If you’re tapping the egress point, chances are you’re dealing with a 1G copper connection, as most of us rarely have a need for more than 1G of internet bandwidth.  If you’re tapping somewhere inside your network you may be dealing with 1G, 10G, or fiber connections or a combination (e.g. 10/100/1000 Base-T RJ-45 Ethernet, 1000 Base-Sx/Lx/Zx Ethernet multimode or singlemode), so keep this in mind as you specify your tapping equipment.

Location – Outside, Inside, DMZ, Pre or Post Proxy?  What About VM Switches?

Next is the issue of the location of the network tap, and the answer to this really depends on what level of visibility you require.  At a minimum I’d want to tap the ingress/egress points for my network, that is, any connection between my organization and the rest of the world.  But that doesn’t quite answer the question, as I still have options such as outside the firewall, directly inside the firewall (my internal segment), or on either side of my web proxy or IPS (assuming they sit in-line).

There are benefits and drawbacks to each of these options; however, I’m most often interested in traffic going between my systems and the outside world.  The answer mainly depends on your network setup and the tools you have (or will have) at your disposal.  If you tap outside the firewall then you can see all traffic, both traffic which is allowed and that which is filtered (inbound) by the firewall.  The drawback is both the noise and the fact that everything appears to originate from your public IP address space, as I’m assuming NAT, overload NAT, PAT, etc. is in use in 99% of configurations.  The next point to consider is just inside the firewall; however, that depends on where you consider the inside to be.  If we call it the inside interface (that which our end users connect through) then I gain visibility into traffic pre-NAT, which shows me the end user’s IP address, assuming an in-line (explicit) proxy is not being used, which would make all web or other traffic routed through the proxy appear to originate from the proxy itself.  Not forgetting the DMZ, we may also tap our traffic as it leaves the DMZ segment, through a tap or SPAN, as that will allow for monitoring of egress/ingress traffic but not inter-DMZ traffic.

Pre- or post-proxy taps need to be considered based on a few factors as well.  If it is relatively simple to track a session identified post-proxy back to the actual user or their system, and it is cheaper to tap post-proxy, then go for it.  If we really need to see who originated the traffic, and what that traffic looks like prior to being filtered by a proxy, then we should consider tapping inside the proxy.  In most situations I’d settle for a tap inside the proxy, just inside the firewall prior to NAT/PAT, and just prior to leaving the DMZ segment.  To achieve this you may be looking at deploying multiple SPANs/taps and using a link aggregator to aggregate the monitored traffic per egress point.

Finally, what about all the virtual networking?  Well, there are point solutions for this as well.  Gigamon’s GigaVue-VM is an example of newer software technology that allows integration with a virtual switch infrastructure.  While this is important if we need to monitor inter-VM traffic, all of these connections out of a VM server (i.e. ESXi) need to turn physical at some point, where they are subject to the older physical technologies mentioned above.

Limitations

This should be a standard section on encryption and how it may blind the monitoring tools.  Some tools can deal with the fact that they “see” encrypted traffic on non-standard ports and report that as suspicious.  Some don’t really care, as they are looking at a set of known C2 destinations and monitoring traffic flows and volumes.  If you’re worried about encryption during a response, you probably should be…and if you’re really concerned, consider looking into encryption-breaking solutions (i.e. Netronome).  Outside of the encryption limitation, after you deploy your tapping infrastructure your network diagrams should be updated (I don’t care who does this, just get it done) to identify the location, ports, and component types of your solution, along with any limitations on traffic visibility.  Knowing what you can’t see is in some cases almost as important as knowing what you can.

Final Thoughts

Find your egress points, understand the network architecture and traffic flow, decide where and how to tap, and deploy the tapping infrastructure prior to having a need to use it…even if you don’t plan on implementing the monitoring tools yourself.  This is immensely beneficial to incident responders in terms of gaining network visibility as quickly as possible.  As time is of the essence in most responses, please don’t make them sit and wait for your network team to get approval to implement a tap, just to find out it was put in the wrong place or needs to be reconfigured.

If this needs to be sold as an “operational” activity for the network team: tapping and monitoring the network has uncovered many mis-configured or sub-optimal network traffic flows, everything from firewall rules which are too permissive to clear-text traffic which was thought to be sent or received over encrypted channels.  Something to keep in mind…who knows, if you ever get around to installing network-based DLP, you’re already on your way as you’ll have tapped the network ahead of deployment.


Cyber Security Experts?

by on May.17, 2012, under General

Reading an article on nbcchicago.com titled “Experts Warn Laptops Could Be Targeted During NATO Summit” made me laugh…specifically this quote: “The chief technology officer at SRV Network Inc. in Chicago told the Sun-Times computer users should make sure their anti-virus software is updated”.  Really?  Sure, if you want to protect yourself against commodity malware that has been floating around for some time…it still amazes me that so-called security experts make this recommendation.  Don’t get me wrong, it is a very safe recommendation to make, and I don’t mean to imply that you shouldn’t run updated antivirus.  What this statement doesn’t convey is that malware can be built, easily and inexpensively, that bypasses your antivirus control regardless of how “up-to-date” the signatures may be.  I hate that these statements give many people a false sense of security…”Oh, nothing can happen to me, I have antivirus enabled and it is up-to-date”.  Maybe it was the brevity of the article in this case that got to me, but I’d probably make some better recommendations here, including:

  1. Update both the operating system you use as well as any applications and browser plug-ins from a known good internet connection (from a connection you own).
  2. Harden the system, disable unnecessary services and remove unnecessary applications.
  3. Consider disabling scripts in your browser, using the NoScript plug-in for Firefox as an example.
  4. Disable services with listening ports where possible.  For example, in Windows, there is no need to run file and printer sharing on a laptop, so turn it off (a quick way to enumerate listeners is sketched after this list).
  5. Consider using a host-based firewall which will limit network-borne attacks against your system.
  6. Connecting to “known” wireless networks is a start, but nothing guarantees that you’re actually connecting to a “good” access point.  It is fairly trivial to run a fake access point and proxy connections, so on that note:
    1. Turn off automatic probing/reconnection so your system isn’t actively looking for and connecting to known access points on your behalf.  As an attacker I can use these probes to set up a fake access point you’ll automatically connect to.
    2. If you own the WiFi access point you’re connecting to, it is trivial to verify the MAC address of the AP, so do it.
    3. If you do connect to an “open” access point you should consider using a VPN connection to encrypt the wireless traffic.  Using SSL/TLS is no longer a guarantee given sidejacking tools like Firesheep.
    4. Don’t assume a WiFi network you set up for a bunch of people to use is “unhackable”.  Many tools exist to break WPA-PSK, and it gets worse if you’re running a router that is vulnerable to WPS PIN attacks.  If you’re running WPA-Enterprise then I’m impressed.
    5. If you’re really paranoid, you can throw a VPN, VPS, and Tor into the mix as well and run the traffic destined for the internet securely through another system in another country.  Ever see that big data center someone is building in Utah?  How about the orange doors at AT&T?  Paranoid yet?
    6. All this talk of WiFi…why not just bust out a 4G hotspot instead?  Protected, of course.
    7. If you’re extremely paranoid, how about running a throw-away system or something off of a live disk like BT5?
    8. Finally, practice restraint in your browsing…don’t click yes to everything without reading, take certificate errors seriously, and try not to get caught up in the excitement.
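On recommendation 4 above, here is what checking for listeners can look like in practice.  This is a minimal sketch using the third-party psutil library (my choice for illustration; netstat or ss will do the same from a command line):

    # List listening TCP sockets so you know what to turn off - a sketch
    # using psutil (pip install psutil); run as admin/root to see all owners.
    import psutil

    for conn in psutil.net_connections(kind="inet"):
        if conn.status == psutil.CONN_LISTEN:
            name = psutil.Process(conn.pid).name() if conn.pid else "?"
            print(f"{conn.laddr.ip}:{conn.laddr.port}  pid={conn.pid}  {name}")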

I do realize some of my recommendations above may be over the average user’s head, but we need to do better than making a blanket “update your antivirus” statement if we really want to empower users or assist them in protecting themselves.  I also think if you search there is probably a guide posted somewhere online, better than what I typed up in 10 minutes, that you could use.

All of the above makes no mention of “why” someone would want to break into users’ laptops.  Sure, there will be a lot of people around using WiFi and mobile data networks and such to connect, share, and post images, video, stories, etc.  I’m just not seeing how this is any different from any other situation, such as travelling and connecting to a hotel’s WiFi network, or the airport’s, or even as I sit here on my own network at home.  Point is, you’re being attacked every day regardless of where you are, so I just don’t get why we are making a big deal out of this because we added NATO to the title.

I’m cranky and need more coffee…


2012 North American CACS Conference

by on May.06, 2012, under General

I’ll be speaking at the North American CACS Conference for ISACA in Orlando, FL on May 7th. I’m on a panel discussing Emerging IT Risks @ 10:15am and @ 3:30pm I’m presenting on Auditing Mobile Computing.


Update on Ethically Teaching Ethical Hacking

by on May.04, 2012, under General

I have to give DePaul University some kudos on this topic. They came around and added my course to the regular course catalog for the Computer, Information, and Network security program as CNS388/488 – Security Testing and Assessment. It is a foundational level course on ethical hacking, the methodology, and the tools used in these types of assessments. I’m happy to see that some schools are coming around and it will be available in the coming Fall quarter.


RSA/EMC Webinar on Security Resilience

by on May.04, 2012, under General, Incident Response

I also presented on an RSA/EMC webinar on security threats and building the right controls back in January that I never posted about.  The link to the event is here.


Cyber Threat Readiness Webinar – May 3rd, 2012

by on Apr.25, 2012, under General, Incident Response

I’m presenting on Cyber Threat Readiness on a webinar on May 3rd with Mike Rothman (President at Securosis) and Chris Petersen (Founder and CTO at LogRhythm).

Register Here

Most IT security professionals readily acknowledge that it is only a matter of time before their organizations experience a breach, if they haven’t already.  And, according to the recent Cyber Threat Readiness Survey, few are confident in their ability to detect a breach when it happens.

In this webcast, three industry experts will discuss the current state of cyber threats and what’s required to optimize an organization’s Cyber Threat Readiness.  Given that it’s “when” not “if” a breach will occur, would you know when it happens to you?  Attend this webcast and increase your confidence in answering “Yes”.

Featured Speakers

Mike Rothman, President, Securosis

Deron Grzetich, SIEM COE Leader, KPMG

Chris Petersen, Founder & CTO, LogRhythm



Is Cloud-based SIEM Any Better?

by on Apr.20, 2012, under General, Log Management

In flipping through some articles from the various publications I read (wow, did I just sound like Sarah Palin?) I came across this comment in an article on SIEM in the cloud:

“Another problem with pushing SIEM into the cloud is that targeted attack detection requires in-depth knowledge of internal systems, the kind found in corporate security teams. Cloud-based SIEM services may have trouble with recognizing the low-and-slow attacks, said Mark Nicolett, vice president with research firm Gartner.” (http://searchcloudsecurity.techtarget.com/news/2240147704/More-companies-eyeing-SIEM-in-the-cloud)

To give some context to the article it was more about leveraging “the cloud” to provide SIEM services for the SMB market, which doesn’t have the staff on hand to manage full-blown SIEM deployments, than it was about detecting attacks, but I digress…

I agree and disagree.  I agree, and have said this before in my arguments for and against using an MSSP for monitoring.  While the context of the article was using the cloud to host the data (and the usual data protection arguments came up), isn’t this just another case of outsourcing to a 3rd party provider and calling it cloud?  Data security issues aside, it doesn’t matter if the MSSP uses their own infrastructure or some cloud provider’s infrastructure; the monitoring service is what I’m paying for.  I’ve said this before, and I’ll say it again: an MSSP is “a SOC”, not “your SOC”.  They do a fair job at detecting events, but may fail to put these into a business context that makes sense.  Again, this is something you can try to get them to do, but personal experience has taught (more like biased) me to believe that it can’t be done.

But I’m also going to disagree and say that it isn’t only cloud-based SIEM providers who miss the low-and-slow attacks.  I’d argue internal security teams are just as likely to miss them, based on the maturity of the monitoring and the surrounding IR processes I see at organizations.  I don’t mean to sound negative, but few organizations have built a solid detective capability that gets down to the level of very carefully crafted attacks which may not result in a lot of traffic and/or alerts.  In addition, the alerts as defined by your SOC/IR team may not be suited to catch these attacks, and even if they are, we still need to ensure we have the right trigger sources and thresholds without overwhelming the analysts who deal with the output.

Either way, my point is that we aren’t very good at this…yet.  What I often see lacking is the level of knowledge in the analysts who review the console, and even in some of the program architects who define the alerts, for the “low and slow” attacks.  In terms of maturity we are still struggling with getting the highly visible alerts configured correctly for our environment, or getting the SIEM we purchased 2-3 years ago to do what we want it to do.  Vendors are doing their part to make deployment and configuration simpler while still allowing flexibility in alert creation and correlation.  But I don’t think that will get us to the level of maturity needed to identify the stealthy attacks…I do think it is going to come down to us providing “security intelligence” versus a monitoring service, but I’ll hold on to that for a future post.

To answer the question in the title: no, not yet.  Again, I think what we are talking about here is just outsourcing using the “C” word, and I’d argue the same points I would if I just said MSSP in place of cloud.  Business context issues aside, it is better than doing nothing and may serve a purpose to fill a void, especially if the organization is small enough that they will never bring this function in house.  One thing that the attackers understand is that although the SMB market may not be as juicy a target as the large orgs, they still have some good data that is worth the effort…and even less risk, since they rarely have solid security programs.  So, is it better than nothing?  Sure.  Is it the correct answer today?  Maybe.  Will it detect a low and slow attack?  No, but your chances with an internal program aren’t that much better today…and they need to be.


SIEM Deployments Don’t Fail…

by on Apr.10, 2012, under General, Log Management

Let me restate the title: SIEM deployments don’t fail.  The technology to accept logs and to parse and correlate events has existed in a mature state for some time now.  The space is so mature that we even have a slight divide between SIEMs that are better suited for user tracking and compliance and those that are better at pure security events, depending on how the technology “grew up”.  So the pure technical aspects of the deployment are generally not the reason your SIEM deployment fails to bring the value you, or your organization, envisioned (no pun intended).

Remember that old ISO thing about people, process, AND technology?  Seems we often forget the first two and focus too much on the technology itself.  While I’d like to say this is limited to smaller organizations, the fact is that it is not.  The technology simply supports the people who deal with the output (read: security or compliance alerts) and the process they follow to ensure that the response is consistent, repeatable, tracked, and reported.  That being said, we also seem to forget to plan out a few things before we start down the SIEM path in the first place.  This post aims to provide you with the “lessons learned” from both my own journey and what I see my clients go through, in a Q and A format.

Question 1. Why are we deploying SIEM or a log management/monitoring solution?

The answer to this is most likely going to drive the initial development of your overall strategy.  The drivers tend to vary but generally fall into the following categories or issues (can be one or more):

  1. The company is afraid of seeing their name in the paper as the latest “breached” company (i.e. is afraid of Anonymous due to their “ethicalness” or possibly afraid of what is left of Lulzsec)
  2. A knee-jerk reaction to being breached recently and the checkbook is open, time to spend some money…
  3. Had some failure of a compliance requirement (i.e. PCI, e-Banking) that monitoring solves (from a checkbox perspective)
  4. Have finally graduated from simply deploying “preventative” controls and realize they need to detect the failure (which happens more than we know) of those controls

Just as important are the goals of the overall program.  Are we more concerned with network or system security events?  Are we focused on user activity or compliance monitoring?  Is it both?  What do we need to get out of this program at a minimum, and what would be nice to have?  Where does this program need to be in the next 12 months?  The next 3 years?  Answering these questions helps answer the question of “why”.  The purpose and mission must be defined before we even think about looking at the technology to support the program.  While this seems like a logical first step, most people start by evaluating technology solutions and then backing into the purpose and mission based on the tool they like the most.  Remember, technology is rarely the barrier.

Question 2. Now that we are moving forward with the program, how do we start?

The answer to this one will obviously depend on the answers to some of the questions above.  Let’s assume for a moment, and for simplicity of this post, that you have chosen security monitoring as the emphasis of the program.  Your first step is NOT to run out to every system, application, security control, and network device and point all of their logs, at the highest (i.e. debugging) level, at the SIEM.  Sure, during a response having every log imaginable to sort through may be of great benefit; however, at this stage I’m more concerned that I have the “right” logs as opposed to “all” logs.  This “throw everything at the SIEM and see what sticks” idea may be partially driven by the vendors themselves or by an overzealous security guy.  I could imagine a sales rep saying “yes, point everything at us and we’ll tell you what is important, as we have magical gnomes under the hood who correlate 10 times faster and better than our competition”.  Great, as long as what is important to you exactly lines up with what the vendor thinks, then go for it (joking, of course).

The step that seems most logical here is to define which events, if they occur, are most important given your organization, business, structure, and the type and criticality of the data you store or find most valuable.  If we define our top 10, 20, 30, etc. and rank these events by criticality, we have started to define a few things about our program without even knowing it.  First, with a list of events we can match these up to the log sources we would need in order to trigger an alert in the system.  Do we need one event source and a threshold to trigger?  Or is it multiple sources that we can correlate?  Don’t be surprised if your list is a mixture of both types.  Vendors would love for us to believe that all events are the result of their correlation magic, but in reality that just isn’t true.  We can take that one step further and also define the logs we would need to further investigate an alert.  Second, we have started to define an order of criticality for both investigation and response.  Given the number of potential events per day and a lack of staff to investigate every one, we need to get to what matters, which should be our critical or higher-risk events, first.
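One lightweight way to capture all of this before touching a SIEM console is to write each event down as structured data: name, criticality, what triggers it, and what you will pull to investigate it.  The format below is entirely hypothetical (no vendor’s schema), but it forces the right questions:

    # A hypothetical "top x" event definition - the mapping is the point,
    # not the format.  One entry shown; your list would have 10, 20, 30...
    monitored_events = [
        {
            "id": "EVT-001",
            "name": "Bulk export from the customer records database",
            "criticality": "high",                # drives response order and SLA
            "trigger_sources": ["db_audit_log"],  # logs that fire the alert
            "trigger_logic": "rows_returned > 10000 and user not a service acct",
            "investigation_sources": [            # logs pulled to investigate
                "vpn_logs", "ad_auth_logs", "web_proxy_logs",
            ],
            "owner": "security_ops",
        },
    ]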

One thing to keep in mind here is not to develop your top “x” list in a vacuum.  As part of good project planning you should have identified the necessary business units, lines, and resources that need to be involved in this process.  Security people are good at thinking about security, but maybe not so much about how someone could misuse a mainframe, SAP, our financial apps, and so on.  Those who are closer to the application, BU, or function may end up being a great resource during this phase.

And finally, events shouldn’t be confined to only perimeter systems.  If we look at security logging and are concerned about attacks, we need to build signatures for the entire attack process, not just for our perimeter defenses, which fail us 50% of the time.  Ask yourself: if we missed the attack at the perimeter, how long would the attacker have access to our network and systems until we noticed?  If the Verizon DBIR report is any indication, the answer may be weeks to months.

Question 3. I’ve defined my events, prioritized them, and linked them to both trigger log sources and investigation log requirements.  Now what?

Hate to say it, but this may be the hardest part of the process.  Hard because it assumes your company has asset management under control.  And I don’t mean being able to answer where a particular piece of hardware may be at a given moment.  I mean being able to match an asset up to its business function, use, application, support, and ownership information, from both the underlying services layer (i.e. OS, web server, etc.) and the application owner.  All of this is in addition to the standard tracking of a decent asset management program, such as location, status, network addressing, etc.  If you lack this information you may be able to start gathering the necessary asset metadata from various sources that may (hopefully) already exist.  Most companies have some rudimentary asset tracking system, but you could also leverage the output from a recent business impact analysis (BIA) or even the output from the vulnerability assessment process…assuming you perform periodic discovery of assets.  Tedious?  Yes.

Let’s assume we were able to cobble something together that is reasonable for asset management.  Using our top “x” list we can identify all of the log sources and match those up to the required assets.  Once we know all of the sources we need to:

  1. Ensure that all assets that are required to log, based on our events, have logging enabled and set to the correct level, and;
  2. That as new assets are added which match a log source type from our event list, they go through step 1 above, and;
  3. That the assets we do have logging to the SIEM continue to log until they are decommissioned.  If they stop logging we can investigate as to why.

One client I had called this a Monitored Asset Management program, or something to that effect, which I thought was a fitting way to describe the process.  This isn’t as difficult as one may think, given that the systems logging into our SIEM tend to be noisy, so a system that goes dead quiet for a period of time is an indicator of a potential issue (i.e. it was decommissioned and we didn’t know, someone changed the logging configuration, or it is live yet has an issue sending (or us receiving) the logs).  One thing that does slip by this process is if someone changes the logging level to less than what is required for our event to trigger, thus blinding the SIEM until the level is changed back to the required setting.
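The “gone dead quiet” check is simple to sketch.  Assuming the SIEM can export a last-seen timestamp per log source (most can, though the CSV layout here is invented), anything silent longer than its expected interval gets flagged:

    # Flag monitored log sources that have gone quiet - a sketch.
    # Assumed input: last_seen.csv with rows of <source_name>,<epoch_seconds>;
    # source names assumed to look like "firewall-01", "web_proxy-02", etc.
    import csv
    import time

    MAX_SILENCE = {"firewall": 300, "web_proxy": 300, "ad_dc": 900}
    DEFAULT_SILENCE = 3600  # seconds of grace for everything else

    now = time.time()
    with open("last_seen.csv") as fh:
        for source, last_seen in csv.reader(fh):
            limit = MAX_SILENCE.get(source.rsplit("-", 1)[0], DEFAULT_SILENCE)
            quiet = now - float(last_seen)
            if quiet > limit:
                print(f"{source}: silent {quiet/60:.0f} min - decommissioned, "
                      f"reconfigured, or broken?")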

In addition to the asset management, we should test our events for correctness at this point.  We should be able to manually trigger each event type and watch as it comes in to the SIEM or dashboard.  I can admit I have made this mistake in the past, believing that there was no way we could have screwed up a query or correlation so that the event would never trigger…but we did.  You should also have a plan to test these periodically, especially the low-volume, high-impact types of events, to ensure that nothing has changed and the system is working as designed.
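Manual triggering doesn’t have to be elaborate.  For a rule that keys off syslog, you can inject a synthetic message at the collector and watch for the alert; the collector name, facility, and message body below are invented for illustration:

    # Fire a synthetic syslog event to verify an alert actually triggers.
    import socket

    COLLECTOR = ("siem-collector.example.com", 514)  # hypothetical collector
    # <13> = facility "user", severity "notice"; body mimics a test trigger
    MSG = b"<13>Apr 10 12:00:00 testhost TEST: synthetic trigger for EVT-001"

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(MSG, COLLECTOR)
    sock.close()
    # If the event never appears on the console, the query or correlation
    # rule is broken - better to find out now than mid-incident.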

Question 4. To MSSP, or not to MSSP, that is the question.  Do you need an MSSP and if so what is their value?

This is also a tough question to answer as it always “depends”.  Most companies don’t have the necessary people, skills, or availability to monitor the environment in a way which accomplishes the mission we set for ourselves in step 1.  That tends to lead to the discussion of outsourcing this to a 3rd party MSSP who has the people and time (well, you’re paying for it, so they’d better) to watch the events pop up on the console and then do “something”.

Let me start with the positive aspects of using an MSSP before I say anything negative.  First, they do offer “staff on demand”, which may be a good way to get the program off the ground, assuming you require a 24×7 capability.  That is a question that needs to be answered in step 1 as well; ask yourself: if we received an alert at 3am, do we have the capability to respond, or would it be handled by the first security analyst on our team in the morning?  24×7 monitoring is great, assuming you have the response capability to match.  Second, they do offer some level of comfort in “having someone to call” during an event or incident.  They tend to not only offer monitoring services but may also have response capabilities, threat intelligence information (I’ll leave the value of that one up to you), and investigation services.

Now on to the negatives of using an MSSP.  First, they are “a SOC looking at a SIEM console”, and not “your SOC that cares about your business”.  The MSSP doesn’t view the events in the same business context as you unless you give them that context and then demand that they care.  Believe me, I’ve tried this route and it leads to frustrating phone calls with MSSP SOC managers and then the sales guy who offers some “money back” for your troubles.  Even if you provide the context of the system, the network architecture, and all the necessary information, there is no guarantee they will use it.  To give you a personal example, we used an unnamed MSSP and would constantly receive alerts from them stating that a “system” was infected, as it was seen browsing and then downloading something bad (i.e. a JavaScript NOOP sled or infected PDF).  That “system” turned out to be the web proxy 99.9% of the time.  To show how ridiculous this issue was, all you had to do was look in the actual proxy log record, which was sent to them, to determine the network address (and host name) of the internal system that was involved in the event.  Side note: they had a copy of the network diagram and a system list which showed each system by name, network address, and function.  Any analyst who has ever worked in a corporate environment would understand the stupidity of telling us that the web proxy was potentially infected.  Second, MSSPs, unless contractually obligated, may not be storing all of the logs you need during an incident or investigation.  Think back to the answer to question 2 for a moment, where we defined our events, trigger logs, and the logs required to further investigate an event.  What happens if you receive an event from the MSSP and go back to the sources to pull the necessary logs, only to find they were overwritten?  As an example from my past (and this depends on traffic and log settings), Active Directory logs at my previous employer rolled over every 4 hours.  If I wasn’t storing those elsewhere I may have been missing a necessary piece of information.  There are ways around this issue which I plan on addressing in a follow-up post on SOC/SIEM/IR design.
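For what it’s worth, the lookup the MSSP couldn’t be bothered to do fits in a few lines.  Assuming a Squid-style access log, where the client address is the third field (your proxy’s format may differ), finding the real internal host behind a “web proxy is infected” alert looks like this:

    # Resolve the internal client behind a "web proxy is infected" alert.
    # Assumes Squid-style access.log: time elapsed client code/status bytes
    # method url ...  Adjust the field index for your proxy's log format.
    BAD_SUBSTR = "evil.example.com/dropper.pdf"  # placeholder from the alert

    with open("access.log") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) > 6 and BAD_SUBSTR in fields[6]:
                print("internal client:", fields[2], "requested", fields[6])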

Question 5. Anything else that I need to consider?  What do others miss the first time around, or even after deploying a SIEM?

To close this post, I’d offer some additional suggestions beyond the (what I feel are obvious) suggestions above.  People are very important in this process, so regardless of the technology you’re going to need some solid security analysts with skills ranging from log management to forensics and investigations.  One of the initial barriers to launching this type of program tends to be a lack of qualified resources in this area.  It may be in your best interest to go the MSSP route and keep a 3rd party on retainer to scale your team during an actual verified incident.  Also, one other key aspect of the program must be a way to measure the success, or failure, of the program and processes.  Most companies start with the obvious metric of “acknowledged” time…or the time between receiving the event and someone acknowledging that they saw it and said “hey, I see that”.  While that is a start, I’d be more concerned that the resolution of the event fell within the SLAs we defined as part of the program in the early stages.  There is a lot more I could, but won’t, go into here on metrics, which I’ll save for a follow-up post.  In my next post I’ll also talk about “tiering” the events so that events with a defined response can take an alternate workflow, and more interesting events which require analysis will be routed to those best equipped to deal with them.  And finally, ensure that the development, or modification, of the overall incident response process is considered when implementing a SIEM program.  Questions such as “how will SIEM differ from DLP monitoring?” and “how does SIEM complement, or not, our existing investigative or forensics tool kit?” will need to be answered.

Conclusion

To recap the simple steps presented here:

  1. Define your program with a focus on process and people as opposed to a “technology first” approach
  2. Define the events, risk ranked, that matter to your organization and link those to both the required trigger log sources as well as logs required to investigate the event
  3. Ensure that the required logs from the previous step are available, and continue to be available to the SIEM system
  4. Consider the use of an MSSP carefully, weighing the benefits and drawbacks of such an approach
  5. Lots of other items in terms of design, workflow, tracking and the like need to be considered (hopefully I’ll motivate myself to post again with thoughts on SOC/SIEM/IR design considerations)

While I think the list above and this post are quite rudimentary, I can admit that I made some of these mistakes the first time I went through this process myself.  My excuse is that we tried this for ourselves back in 2007, but I find little excuse for larger organizations making the same mistakes some 5 years later.  Hopefully you can take something from this post, even if it is to disagree with my views…I just hope it encourages some thought around these programs before starting on the deployment of one.

