Gotta love buzzwords (actually, you don't)
Lately, every company on Earth (including mine) has been touting its big data experience. Most of them really just mean collecting logs from various systems and tying the alerts together, sometimes in a hurried and sloppy manner. Anyone can collect a bunch of data and loosely tie it together, but the goal should be to actually analyze that data and make it actionable. That takes more than a lot of places think. Let me explain.
How do we do things? Glad you asked.
We have a ~30 node Splunk cluster that collects log data from a bunch of clients. We also have remote Splunk implementations that collect the data locally and send the correlated information and any alerts up to our cluster. That's a lot of data. The end result, though, is real honest-to-goodness humans reviewing and analyzing that data, which means our clients get actionable alerts.
We use lots of sources for this, not only client logs. We tie together published blacklists, watch lists, and so on with data from our own honeypots and IDS sensors. We also try to employ some anomaly detection and semantic logic to figure out whether our clients are being threatened in some way but possibly not yet actively attacked. That last one is a lot like mind-reading, but it flows into how we handle the real operational and actionable alerts.
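To make the "tying together" part a little more concrete, here's a minimal sketch of the idea in Python. It is not our actual pipeline (that lives in Splunk); the `LogEvent` type, the `correlate` function, and the sample IPs (all from documentation address ranges) are made up purely for illustration. It just shows client log events being checked against two outside sources, a published watch list and our own honeypot sightings, so each flagged event carries the reasons a human analyst would want to see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical, simplified shape of one client log event."""
    client: str
    src_ip: str
    message: str


def correlate(events, watchlist_ips, honeypot_ips):
    """Return (event, reasons) pairs for events whose source IP
    shows up in at least one threat source."""
    flagged = []
    for event in events:
        reasons = []
        if event.src_ip in watchlist_ips:
            reasons.append("published watchlist")
        if event.src_ip in honeypot_ips:
            reasons.append("seen by honeypot")
        if reasons:
            flagged.append((event, reasons))
    return flagged


# Made-up sample data for illustration only.
events = [
    LogEvent("client-a", "203.0.113.7", "failed ssh login"),
    LogEvent("client-a", "198.51.100.20", "http 200 /index.html"),
    LogEvent("client-b", "192.0.2.44", "port scan detected"),
]
watchlist_ips = {"203.0.113.7"}
honeypot_ips = {"203.0.113.7", "192.0.2.44"}

flagged = correlate(events, watchlist_ips, honeypot_ips)
for event, reasons in flagged:
    print(event.client, event.src_ip, "->", ", ".join(reasons))
```

An event that matches more than one source is a stronger candidate for human review than one that matches only a single list, which is why the sketch keeps the reasons rather than returning a bare yes or no.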
Make sure you have an end goal in mind
I'm writing all this to detail for you how big data in security should work.
No system is perfect, including ours, but you have to have an end goal in mind, and our end goal is to provide actionable alerts and intelligence to our customer base. The worst thing you can do, as a customer looking for this kind of service, is say: "I just want you to watch my stuff." You should have a set of questions ready to grill any salesperson you're entertaining at the time (including mine, please grill them). Those questions should be tough. Ask how their systems work, what they automate, what they don't, what they mean by "monitoring", and, most importantly, how they will answer the questions you have about the data they're collecting from you.
Steer clear of cookie cutter solutions
That last one is the most crucial. If you don't really know what questions you need answered, your project is doomed to failure and you'll end up with a cookie cutter solution you may not like much. Of course, if you're just looking to check a compliance box and not really increase your security, then any old data collector will do.