There comes a time in every developer’s life when they spend way too long trying to work around a problem that was already easily solvable some other way. If you haven’t had one of these moments, keep developing - you will. You’ll develop a function, and you’ll re-develop it, and you’ll get to version 85 and you still won’t be quite where you wanted. Hopefully these instances are rare, and when they do happen, you learn a valuable lesson.
I ran into one of these issues over the past 6 months while developing the Nest Add-on for Splunk. This is an Add-on I developed with Tim Baldwin (creator of the Broken Hosts App for Splunk). The goal of the Nest Add-on for Splunk is to collect data from your Nest devices in the simplest way possible; that’s always been the goal.
Because of the way OAuth 2.0 works, Tim and I also had to create a Splunk Add-on for Nest “Product” in the Nest Developer portal. This product is used to register Nest user accounts and make requests on their behalf to the Nest API. This type of authentication has many benefits. One of them is that Nest can limit how many people use a given product to access its API.
Recently, Tim and I maxed out the limit of users for our Nest “Product”, something that didn’t take us long to do. This was a good problem to have, because it meant we had a great adoption rate. It came to a point, however, where Nest said, “Alright, this is great that you have a popular product, but now we need to make sure you’re playing by OUR rules”. Unfortunately, we weren’t exactly playing by their rules.
I want to be clear, I don’t fault Nest in any way for taking the time to review Products. I think the rules they held us to make complete sense. The problem Tim and I had was that we wanted to ensure that a few different things were happening all at the same time:
PLAYING BY NEST’S RULES
Any company who allows others to use their API as a service is doing just that: providing a service. As a result they have the right to say what’s okay and what’s not okay. This is the internet after all, and people are going to try and abuse it. We knew we had to keep this as the focus for development.
GOOD SPLUNK EXPERIENCE
I hope a lot of the other IoT (Internet of Things) products that integrate with Splunk start to utilize the flow we’ve implemented with this Add-on and AWS Lambda. Without AWS Lambda, Tim and I were stuck in a bit of a rut. To continue growing, we had only two options.
The first and easiest (from a development perspective) was to force every user who wants to use this Add-on to create their own Nest “Product”. This would mean each user would have to create a developer account, create a Nest product, and still do all of the authorization/authentication before even getting to Splunk.
This may be a big assumption, and my own personal belief, but I think forcing every user to go through that many steps would result in a much lower adoption rate. I don’t want to seem skeptical of every Splunk user, because sure, some of you wouldn’t have minded a few pages of documentation. In fact, some of you might have even liked it. I was going with the 80/20 rule though. I wanted most of our users to have an easier time getting off the ground, even if it meant a little bit more work on the development side.
Option 2 was to find a way to store the secret key securely and allow everyone to use the same Nest “Product”.
APPLICATION WITH GOOD SECURITY
The biggest issue Nest had with our app was how we handled authentication. Granted, we didn’t implement it in the best way possible, but we still hoped we could get by with what we had. We submitted to Nest over and over again and came back with our tails between our legs each time. This was the function I talked about earlier that we developed and re-developed time and time again. We hoped we were securing it well enough for Nest to give us a pass.
This was difficult because Nest uses a secret key to translate the one-time authorization code into a token as part of the OAuth 2.0 Authentication mechanism. As a result, that secret key needed to be protected, or anyone could start issuing API access tokens. We didn’t have a web server to dedicate to the Nest Add-on for Splunk that could handle requests, so we were stuck trying to code around that problem.
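To make the role of the secret key concrete, here is a minimal sketch of what the code-for-token exchange looks like in a standard OAuth 2.0 authorization code grant. It is an illustration rather than the Add-on's actual code: `TOKEN_URL`, `build_token_request`, and the sample values are hypothetical, and the real token endpoint would come from the provider's documentation. It only builds the request and sends nothing.

```python
import urllib.parse
import urllib.request

# Hypothetical placeholder, not the real endpoint.
TOKEN_URL = "https://example.com/oauth2/access_token"


def build_token_request(code, client_id, client_secret, token_url=TOKEN_URL):
    """Build the POST that trades a one-time authorization code for a token."""
    form = urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        # Whoever holds this value can perform the exchange,
        # which is why it must not ship inside a distributed app.
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Example with made-up values; nothing is sent over the network here.
request = build_token_request("one-time-code", "my-client-id", "my-client-secret")
```

The point to notice is that the client secret travels in the same request as the authorization code, so whatever code performs this step has to hold the secret.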
USING AWS LAMBDA TO SOLVE THE PROBLEM
One great benefit AWS provided for us was the ability to store our client secret in the cloud, encrypted. This is actually the most secure way we found to handle this authorization/authentication process. The user authorizes their account, they are redirected to our Lambda function with a one-time authorization code, and our function exchanges that code and returns an access token for the user to use with the Nest Add-on. It’s truly a thing of beauty, no matter how much we grumbled along the way.
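As a rough sketch of that flow, a Lambda handler for the redirect step might look something like the following. This is not the function Tim and I deployed. It assumes the Lambda sits behind an HTTP front end that passes query parameters in `event["queryStringParameters"]` (the API Gateway proxy event shape), and `make_handler` and `exchange_code` are hypothetical names. The exchange itself is injected so the handler can be exercised without touching the network.

```python
import json


def make_handler(exchange_code):
    """Return a Lambda-style handler that uses exchange_code(code) -> token."""

    def handler(event, context):
        # Assumed event shape: query parameters from the OAuth redirect.
        params = event.get("queryStringParameters") or {}
        code = params.get("code")
        if not code:
            return {
                "statusCode": 400,
                "body": json.dumps({"error": "missing authorization code"}),
            }
        # The client secret is only ever used inside exchange_code,
        # on the server side, never in the user's browser or Add-on.
        token = exchange_code(code)
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"access_token": token}),
        }

    return handler


# Example with a stand-in exchange function:
handler = make_handler(lambda code: "token-for-" + code)
response = handler({"queryStringParameters": {"code": "abc123"}}, None)
```

Keeping the exchange behind a function argument also makes it easy to swap in a fake during testing.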
This mechanism is a really great and mature method of doing authentication, although it did take me a little while to come to terms with how great it is. I was getting used to many other somewhat lazy IoT products that use much more simplistic API Keys. The LIFX Add-on for Splunk, for example, just requires you to go into a Web Interface and copy your API Key.
In the end, the flow below is what our authentication process looks like for the Nest Add-on.
SO, WHY WAS AWS LAMBDA SO COOL HERE?
The first reason came from being inspired to use this tool while sitting at the Splunk Americas Partner Technical Symposium. I heard some bright minds like Roy Arson talking about the power of AWS and how you can run code without needing an entire server. You basically pay per execution of the code you write.
Now, of course, we could have done this the old-fashioned way and it would have worked just as well. We would have started by standing up a server. We could have installed apache2 to handle web requests and written some code. Then we would have kept that VM running on a publicly accessible IP somewhere. But at what cost? What if we’re just the lowly developer who wants to translate authorization codes into access tokens without the overhead and cost of hosting an entire web server? That’s where AWS really shines. Lambda was perfect for this use case.
Fortunately, for our use case, we have a very limited number of executions per month so we can utilize this service completely for free. On top of all that, the Lambda I wrote is only 20 lines of code.
I want to be clear, my Lambda is only 20 lines of code.
This doesn’t mean that AWS’s infrastructure is only 20 lines of code. There are plenty more lines out there to keep this service running. There are systems, and backup systems, and all kinds of other overhead. But two benefits are included: low cost (free) for Tim and me, and high availability for the public, who will be utilizing this authentication mechanism when they use the Nest Add-on for Splunk.
From a security perspective, we are doing much better too. We no longer need to distribute an app with a secret key in order to give our users a good authentication experience. This makes everyone involved in the process much happier, especially Nest.
We went from doing something terrible like distributing the client secret for our Nest “Product” with our Add-on, to transmitting an authorization code directly to Lambda over HTTPS and allowing our Lambda to do the exchange of authorization code for access token (also over HTTPS) between AWS and Nest. Trying to get in the middle of AWS and Nest will arguably be much more difficult than getting between Nest and a user setting up our Add-on while sitting in Starbucks.
Inside of our Lambda, we were able to store the client secret encrypted using KMS. So, the only place that the client secret is sitting in plain text is in the Nest Developer Portal. The beauty of KMS in this case is that we’re really doing everything we can to protect this key. We can disable our key, rotate it, and define a strict ACL for which functions are allowed to utilize it. If we were to disable our key, our Lambda would immediately lose the ability to decrypt that client secret, and so would anyone else.
Last but not least, if someone were to obtain the client secret, we have two protections to limit how bad the damage is. For starters, even if we get the user limit increased on our product, it won’t be unlimited. So the bleeding can only be so bad, especially combined with Nest’s rate limiting for using their API. Second, we’d be able to contact Nest to say, “Hey, we believe there is an issue here and that secret is public. Please disable”. Of course, Nest hopefully is also using some kind of monitoring to detect any type of harmful increase in activity. If they’re not, I’d recommend they use Splunk.
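For a sense of what decrypting a KMS-protected secret inside a function can look like, here is a minimal sketch. It is an assumption-laden illustration, not our production code: `load_client_secret` and `_cache` are hypothetical names, and the ciphertext is assumed to arrive base64-encoded (for example from an environment variable). The KMS client is passed in, and in a real Lambda it would typically be created with `boto3.client("kms")`.

```python
import base64

# Cache the decrypted value so a warm Lambda container
# does not call KMS again on every invocation.
_cache = {}


def load_client_secret(kms_client, ciphertext_b64):
    """Decrypt a base64-encoded KMS ciphertext and return it as text.

    kms_client is expected to expose decrypt(CiphertextBlob=...) and return
    a dict with a "Plaintext" bytes value, as the boto3 KMS client does.
    """
    if ciphertext_b64 not in _cache:
        response = kms_client.decrypt(
            CiphertextBlob=base64.b64decode(ciphertext_b64)
        )
        _cache[ciphertext_b64] = response["Plaintext"].decode("utf-8")
    return _cache[ciphertext_b64]
```

If the KMS key is disabled, the `decrypt` call fails and no new container can recover the secret. Note that with a cache like this one, a container that is already warm keeps its decrypted copy until it is recycled.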
THANK YOU AWS LAMBDA
I’m happy to say the problem has been solved and I think my partner in crime on this project, Tim Baldwin, is just as happy to see this problem behind us. Thank you to AWS Lambda for solving quite a large problem with little effort.