Navigating the Cloud Journey

Episode 1: Visibility Strategies for Hybrid Cloud Environments

June 23, 2021 Michael Valladao Episode 1

In our first episode, Mike is joined by Ethan Banks, networking expert, podcaster, blogger, and co-founder of Packet Pushers Interactive.

First, they discuss what hybrid cloud means and then jump into a lively conversation about important considerations for cloud adoption, who's responsible for security, adopting new tech and processes, organizational changes, gotchas, and much more.

Enjoy!



Mike Valladao: Welcome to the first episode of the Navigating the Cloud Journey podcast series. I’m your host, Mike Valladao, and today we discuss visibility strategies for hybrid cloud environments.

Mike Valladao: I’m delighted to have Ethan Banks as our guest today. Ethan architected networks for more than 20 years before he co-founded Packet Pushers Interactive LLC.

Mike Valladao: Today he is a renowned author, blogger, and industry consultant. His company is responsible for producing content for multiple shows, including Heavy Networking and Day 2 Cloud.

Mike Valladao: So, we're going to jump right into content here. Ethan, I keep hearing that hybrid cloud has become the de facto standard. Is that true and, if so, what does it mean? Is there any way we could perhaps define some of these terms?

Ethan Banks: So, if we define hybrid cloud as: you've got public cloud, you've got stuff in Amazon Web Services and Azure and wherever else you've got stuff, and then you've got some kind of on-premises gear, because you do. That’s a rough definition to me of hybrid cloud.

Ethan Banks: There are few shops that have got everything migrated off to AWS. Everybody's got something on prem because reasons.

Ethan Banks: Now, I guess we can argue, Mike, about whether or not a cloudy on-premises infrastructure, one that is self-service and API driven, infrastructure as code, qualifies. A lot of times, for people that are doing on-prem stuff, it's kind of their legacy environment. But, you know, hybrid cloud is a big umbrella, I guess, so maybe we've just got to keep it at this: there's stuff on prem, Mike, and stuff in the public cloud too.

Mike Valladao: That's correct, and, in fact, you could even talk about multi-cloud, because sometimes somebody will have a combination of multiple cloud environments and even maybe a private cloud as well. So for our conversation let's keep it broad, okay? So, would you agree that it tends to be a de facto standard, and why?

Ethan Banks: It's a de facto standard because that's what businesses need. Okay, I’m gonna qualify that. That's what businesses need, or sometimes that's what businesses have.

Ethan Banks: Shadow IT has driven a lot of public cloud adoption, of course. We're past, but not entirely past, the shadow IT thing, but I think a lot of shops have done their best to bring the public cloud presences that they have into their standard governance.

Mike Valladao: I’m going to quote you here. You said at one time that “The cloud is something that happens when IT is doing other things.” Is that true?

Ethan Banks: Well, of course, that's the tagline for the Day 2 Cloud podcast, and that's the joke, right? You know, shadow IT has created all kinds of cloud adoption. And I think that, because of that, we're aware that that's an issue, and companies and businesses have done their best to bring that all in house and get it under standard governance. And yet we can't just pick up everything we've got in our data centers and move it to cloud, and there are a lot of reasons for that. Some applications do not do well in cloud. Some companies have tried this with a lift-and-shift approach. They pick it up, it was a VM, let's say, running on VMware, and they move it to an Amazon EC2 instance, and it's really expensive to pick it up and move it and drop it into an EC2 instance that got oversized because you've got to have it be a big instance, right? You never know how much RAM or CPU you're going to need, and so you size it big, and you pay dearly for it. The business gets the bill… “Wow, we shouldn't have done that,” and then they end up maybe repatriating that workload. And so businesses are getting smarter about this now, where… All right, we know you can't just pick it up and move it, that's not the right approach for a lot of applications. So they're in a situation of “I’m going to keep it in house because eventually I’m going to replace it with something else… what's the point of moving it to cloud?” And so they've got on prem for that reason.

Ethan Banks: Or they're in a project where they're going to refactor that application. Let's say they're going to turn it into something that can work well in a cloud native environment, but that is time consuming and potentially expensive to make happen, and so that app is not going to move to cloud today. It might be in a year or three, but not today. And so, this hybrid cloud environment is just here for very practical reasons.

Mike Valladao: Right and the fact is, we still have to worry about security, we still have to worry about analytics. Who's responsible for all these things?

Ethan Banks: This is a difficult question. So, on the one hand you've still got operations folks that are thinking about a lot of these things.

Ethan Banks: Security is such a big question, and it is complicated. Who do you make responsible for this? When you begin to deploy applications into public cloud, the security model is somewhat different and more nuanced. There are maybe just different ways that you go about securing your VPC and the various PaaS services you might be consuming, versus what you're used to in-house. So, who does that? Is it the ops person who maybe was responsible for standing that up? Is it some kind of a secops team? Is it a governance team?

Ethan Banks: I mean, we could go on about this, Mike, but I don't think the paradigm changes particularly, as far as who's responsible for security, just because we've introduced public cloud into the mix.

Ethan Banks: I think what changes is just bringing in a whole new environment and a different way of thinking and making sure that there is a model that as developers consume public cloud, they are fitting into an existing and defined security strategy. The rules have been set up, ideally, it's been automated for them when those workloads get deployed. I don't know that we see that, in reality all that often, but that's ideally what we'd like to see.

Mike Valladao: Exactly! Because with a shared environment here, and the question of, you know, who actually owns these things, it does make it more difficult. Because one size doesn't fit all for everybody in cloud. And, as a result, we've kind of got to navigate the journey ourselves and try to figure out what makes the most sense in which environments.

Mike Valladao: And along that same line; if you're going to be trying to do the security, do the analytics, trying to make sure everything works out there, where does visibility really come into all this? Why is that important?

Ethan Banks: Well, so let's define visibility, Mike. So, I’m going to take a step back for a second here and say: when you're talking about cloud environments and operating applications in a cloud-like infrastructure, maybe it's running on Kubernetes, observability is the magic word. There's visibility, but then there's observability, and that's, I think, referring to something that's subtly different. So, observability, where we're higher up the stack, is a big, big buzzword, and there are a lot of products that are in the “observability” space. But you're looking at how applications interconnect; you're looking at, say, a microservices-architected application where there are a lot of calls going back and forth across the network to fulfill the transaction request that's come through. How is that being done? What does that look like? And you have tools that do distributed tracing, let's say.

Ethan Banks: All right, that's one kind of thing. There's a lot of telemetry that can come out of that world, so you can get a sense of exactly what the app is doing, how the different parts of the application are working together, the different tiers if it's that sort of an architected app. But if we get down to visibility, Mike, I think, you know, this is a Gigamon podcast… I think you and I have, from a networking perspective, a different thought about what visibility is. We're trying to get down to the packet level, am I right?

Mike Valladao: Whenever we can, the packets are often your best source of information. However, it could be higher level as well, it could also be things with metadata. So, there's lots of different options out here.

Mike Valladao: We're just trying to define what we mean by visibility. And for me, visibility means being able to see what's going on. To be able to get rid of the black holes, if you will, because sometimes things fall into a position that we don't know what's going on, and maybe want to use it for…obviously, for security purposes, but also sometimes for troubleshooting.  That's a big deal, and it makes it difficult in the cloud.

Ethan Banks: So, by “we,” we're getting at network engineering, the network ops folks that are being blamed for the app performing poorly.

Mike Valladao: Good point absolutely! The “we” is typically the people that are either the secops or the networking folks because they're the ones that get the blame when something goes wrong.

Ethan Banks: Yes, and so in that case, you know, “visibility” means what it's meant to us all these years. We want to be able to grab packets off the wire if we can. And we don't necessarily want all the packets, practically speaking. We can't get all the packets; the wires are too fast or pushing too much data to capture it all. But we want the ability to know, at any point on the network, what data is flowing between two points, and see those packets, because the packets don't lie, and figure it out from there.

Ethan Banks: Although, Mike, I've just got to point out: if we get to the point in troubleshooting where it's like, “Oh man, I gotta break out the sniffer and start looking at what's going on,” your day is not going well!

Mike Valladao: You're absolutely right here, because we don't want to have to get to that point, but we do want the information to be able to keep us from getting to that point.

Mike Valladao: So, are there good ways to maybe leverage similar security across the hybrid environment… what are your thoughts there? How do you go about getting the information and moving it around? Give us some thoughts here. What are the best practices?

Ethan Banks: So, it sounds like we're talking similar security in the context of visibility, Mike, yeah?

Mike Valladao: Yes, I think so.

Ethan Banks: In our shops, in the data centers that we owned, we've built out visibility fabrics. That's the thing that we've done. And we've been able to plumb into anywhere in the network and pull bits off the wire using SPAN ports on a switch, let's say, something like that, or, you know, maybe some agents here and there. But the big thing was we owned the network, we owned the physical switches and the physical hardware, and so we could plug in. For example, I need this data that's on the network copied over to this SPAN port, and then we send it off to something like a Gigamon box and then off to a tool for analysis.

Mike Valladao: And it could be anything from Bro, Suricata, whatever you happen to have. It could be Palo Alto Networks. We don't really care what the tools are, we just want to be able to feed them properly.

Ethan Banks: In the cloud we don't have any switches anymore.

Mike Valladao: No switches! What happened?

Ethan Banks: So that really introduces the new paradigm here. Well, it's not new, but we need to be thinking: how do we solve the problem if we don't own the switches, but we still want the packets? Where do we tap in so that we can see that stuff? And there's a bunch of answers to that question, Mike, several different things that we can get into.

Mike Valladao: Let's take AWS as an example. How would one be able to get some of the data?

Ethan Banks: So, Amazon offers a service called VPC Traffic Mirroring where, on an EC2 instance, you can say, I want to mirror traffic coming off of this EC2 interface and send it somewhere. Amazon will tunnel it for you and send it to where you want it to go.

Ethan Banks: They also offer that on a network load balancer, so if you've got a network load balancer sitting out in front of your application, you can do it there.

Ethan Banks: And when you do that, what you're doing, as you kick off this service, is relying on Amazon to pull the packets off of the wire (but it's Amazon's wire), encapsulate them for you, tunnel them across your environment, and then send them to where they need to go.

Ethan Banks: What's interesting about this, though, and what's different, is that in our data centers, when we owned this ourselves, we never thought about things like egress charges, because, you know, we're not paying any extra. But you are in the Amazon environment, so you have to think about where you're shipping this stuff to. Are you shipping these packets, using the VPC Traffic Mirroring service, offsite? Or to somewhere in the cloud for processing?

Ethan Banks: Actually, I've got a quote here from some of Amazon's documentation. They point out that mirrored traffic counts towards instance bandwidth. So, if you've got a gig of inbound traffic and a gig of outbound traffic and you're mirroring it, you're going to end up with three gig of outbound traffic in total: a gig for the outbound, a gig for the mirrored inbound, and a gig for the mirrored outbound traffic.
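As a rough illustration of the mirroring setup Ethan describes, the sketch below assembles the parameters for one mirroring session in Python. It only builds a dictionary and does not call AWS. The key names are meant to follow the EC2 CreateTrafficMirrorSession request as we understand it, so check them against the current API reference; the `MirrorSessionPlan` helper and all resource IDs are hypothetical placeholders, not something from the episode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MirrorSessionPlan:
    """Hypothetical helper describing one traffic mirroring session."""
    source_eni_id: str                   # network interface on the EC2 instance to mirror
    target_id: str                       # mirror target that receives the copied packets
    filter_id: str                       # mirror filter that decides which packets are copied
    session_number: int = 1              # priority when one interface has several sessions
    packet_length: Optional[int] = None  # bytes kept per packet; None mirrors whole packets
    vxlan_id: Optional[int] = None       # tunnel identifier; None leaves the choice to the service

    def to_request(self) -> dict:
        """Build keyword arguments shaped like a CreateTrafficMirrorSession request."""
        if not 1 <= self.session_number <= 32766:
            raise ValueError("session_number must be between 1 and 32766")
        request = {
            "NetworkInterfaceId": self.source_eni_id,
            "TrafficMirrorTargetId": self.target_id,
            "TrafficMirrorFilterId": self.filter_id,
            "SessionNumber": self.session_number,
        }
        # Optional settings are only sent when explicitly chosen.
        if self.packet_length is not None:
            request["PacketLength"] = self.packet_length
        if self.vxlan_id is not None:
            request["VirtualNetworkId"] = self.vxlan_id
        return request


plan = MirrorSessionPlan(
    source_eni_id="eni-0123456789abcdef0",  # placeholder ID
    target_id="tmt-0123456789abcdef0",      # placeholder ID
    filter_id="tmf-0123456789abcdef0",      # placeholder ID
    packet_length=128,                      # keep headers, drop most payload
)
request = plan.to_request()
print(request)
# Assumption: with boto3 installed and credentials configured, this dictionary
# could be passed as boto3.client("ec2").create_traffic_mirror_session(**request)
```

Keeping the request as plain data makes it easy to review what will be mirrored, and how much of each packet, before anything billable is created.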
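The arithmetic in that example can be written out as a small Python sketch. The function name and parameters are illustrative only, and it assumes, as in Ethan's example, that every mirrored packet leaves the instance as additional outbound traffic.

```python
def instance_outbound_gbps(inbound_gbps: float, outbound_gbps: float,
                           mirror_inbound: bool = True,
                           mirror_outbound: bool = True) -> float:
    """Estimate outbound bandwidth consumed on an instance whose traffic is mirrored."""
    total = outbound_gbps           # the normal outbound traffic
    if mirror_inbound:
        total += inbound_gbps       # copy of the inbound traffic, sent to the mirror target
    if mirror_outbound:
        total += outbound_gbps      # copy of the outbound traffic, sent to the mirror target
    return total


# One gig in, one gig out, both directions mirrored.
print(instance_outbound_gbps(1.0, 1.0))  # 3.0
```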

Mike Valladao: You're saying that adds up actually.

Ethan Banks: It's kind of one of those obvious things, like, “Oh yeah, I mean, wow, okay, I gotta really think about this!” What you probably want to do is make sure you're not shipping it off site, unless it's an occasional thing that you do once in a great while and you don't mind that occasional egress charge.

Ethan Banks: Maybe you want to be shipping it to some instance inside the cloud for processing and then maybe ship metadata or something offsite; something like that.

Mike Valladao: As an example, you might have a VPC that's set up as a tool stack. And you could have your tools right there; that's one way. There are different ways that you can do things here, and because of the amount of money that it costs to ship the packets back and forth, you're simply better off doing your filtering ahead of time. Don't do it after you ship it back on prem; do the heavy-duty lifting up front in the cloud, because in the cloud you'd be able to do your filtering and say, for example, I want to maybe do some packet slicing, maybe I only want certain types of packets, etc. Like you said, don't make it so that one plus one plus one equals three; just send what you need to. Be smart about it.

Ethan Banks: I’m thinking well, of course everybody does that, but then of course, we hear the horror stories about the surprise bills that people get. So maybe not everybody thinks about this. But yeah exactly; you want to do that filtering right up front and minimize those fancy charges.
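To make the "filter first, then ship" idea concrete, here is a minimal Python sketch of filtering by protocol and slicing packets before they leave the cloud. It is an illustration under stated assumptions, not how any particular packet broker works: packets are treated as raw IPv4 packets with no link-layer header, and the helper names `make_ipv4_packet` and `filter_and_slice` are invented for this example.

```python
import struct
from typing import Iterable, Iterator, Set

TCP, UDP = 6, 17  # IANA protocol numbers


def make_ipv4_packet(protocol: int, payload: bytes) -> bytes:
    """Build a minimal IPv4 packet (20-byte header, checksum left at zero) for the demo."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload),   # version/IHL, DSCP, total length
        0, 0,                         # identification, flags and fragment offset
        64, protocol, 0,              # TTL, protocol, header checksum
        bytes([10, 0, 0, 1]),         # source address
        bytes([10, 0, 0, 2]),         # destination address
    )
    return header + payload


def filter_and_slice(packets: Iterable[bytes], keep_protocols: Set[int],
                     snap_len: int) -> Iterator[bytes]:
    """Yield only packets of the wanted protocols, truncated to snap_len bytes."""
    for packet in packets:
        if len(packet) < 20:
            continue                      # too short to hold an IPv4 header
        if packet[9] in keep_protocols:   # byte 9 of the IPv4 header is the protocol
            yield packet[:snap_len]       # packet slicing: keep headers, drop most payload


captured = [
    make_ipv4_packet(TCP, b"x" * 1000),
    make_ipv4_packet(UDP, b"y" * 1000),
    make_ipv4_packet(TCP, b"z" * 10),
]
kept = list(filter_and_slice(captured, keep_protocols={TCP}, snap_len=64))
print(sum(len(p) for p in captured), "bytes captured,",
      sum(len(p) for p in kept), "bytes to ship")
```

The point of the sketch is the ordering: the reduction happens next to the source, so only the bytes that are actually needed incur transfer charges.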

Ethan Banks: And Mike, most of the clouds, as I was digging around, offer something like this Amazon VPC Traffic Mirroring service. Google Cloud has what they call their VPC Packet Mirroring service, which sounds very similar and has some of the same caveats that the Amazon service has. Azure is an interesting one. They have a service called virtual network TAP, but it's a “preview feature,” not supported currently. There's a doc I found dating back to 2019 where they say, yeah, it's a thing, and if you read through it, it sounds like the Google and the Amazon docs. But again, it's a preview feature, not available to everyone, and I checked with one of my contacts at Azure and he confirmed that's where it's at. Not really a live thing now in Azure. It says, hey, go to the packet broker folks, which was interesting. It's like, okay, so you've got these cloud native services that can do packet mirroring for you, but if you can't get done what you need, depending on what cloud you're in and what functionality you're looking for, then now you're into a traditional packet broker again, like we've had for years, only redesigned in the image of the cloud, so you can still get your packets off the wire in the cloud. And I know this is where Gigamon lives, as do a whole bunch of vendors that live in this space now.

Mike Valladao: I’d like to make a plug. I don't normally want to make plugs for Gigamon, but Gigamon helped Azure in putting that together. So that future technology, which you said is very similar to the VPC mirroring that other clouds have, is something that Gigamon has had a hand in working on, because of course it's very important to us, as well as to the industry, to be able to have that kind of capability in the future.

Ethan Banks: So, the Azure virtual network TAP is what you worked on?

Mike Valladao: Actually, it’s their mirroring capabilities that have yet to debut. They are working with Gigamon. So that's just a little bit of insight for you here. So anyway, there are other ways of getting the traffic as well, of course. Let's say that you've got virtual switches out there. There are also things such as ERSPAN and things of that nature. So, there are ways of getting some.

Ethan Banks: Well, I’d forgotten about that, and it's such a replica of what we do onsite. It's like, okay, I've got a Cisco CSR 1000V up there and, yeah, you can do ERSPAN off of that. But I really think one of the errors that we make as network engineers moving into cloud is trying to make cloud behave like my on-premises stuff behaves. Let that go; that's not how that's supposed to work anymore. Get your head around how the cloud wants you to use networking: what the limitations are, what the boundaries are, and how they want you to do things. And it's confining. Which is ironically also liberating, because it's like, wow, I don't have ten ways to do this, I have like one way to do this, or maybe two ways to do this. But so often our go-to is, I want to do it the way I’m familiar with now and just replicate it in the cloud.

Ethan Banks: But that can be slow, it can be inefficient, it can be expensive, and we need to unlearn what we've learned from on-premises data center work our whole careers.

Ethan Banks: We have an opportunity now to learn something new, some fresh technology. And yeah, there are a lot of things that say we don't have to relearn it all from scratch, but to me, if your solution to packet capture is to throw up a CSR 1000V and do ERSPAN, you're not taking an opportunity to do things in a new way. Maybe it's the most effective way as a short-term solution, maybe it's the easiest thing to do. But to me it's like falling back on what you know when there's probably a better, more cloud native, or at least cloud friendly, way to do such things.

Mike Valladao: Well, I don't disagree with you at all here. So, let's go into even some of the uglier components that are out there in the cloud. There's a whole bunch of different pieces out there, and what about things like agents? You know, I know everybody loves agents, right? Everybody wants an agent out there. You just want one more agent to make everything work better for you.

Ethan Banks: I mean, well, you're saying that with your tongue firmly in your cheek, Mike. I can see you on video here, and you're almost biting your tongue off. But the agent thing architecturally solves a certain problem, right? Because then you sort of don't care where the workload is. The agent doesn't even necessarily know there's public cloud there. You can deploy the agent and you can see the agent wherever it lives. You can do the thing, whatever it is that the agent is called upon to do.

Ethan Banks: No, we all hate agents, and it's annoying, and we've got to standardize them in our builds and, you know, put them up, and they add, you know, weight and potentially latency and all kinds of things. But architecturally, they're kind of nice from that perspective, just for ease of deployment. But you did preface this with “the uglier”!

Mike Valladao: Absolutely, where this is for practitioners, as we’ve talked about so the concept is, we want to talk about these things, because if you understand what your options are then you can make the right decisions and that's where I think we're really going here. Because there are a lot of options and what happens is a lot of people that are used to being in on prem, like you said, they're used to doing things a certain way.

Mike Valladao: That has to shift, that has to evolve, and it will evolve both with the cloud companies themselves, as well as the way that we do business ourselves too. Because you have to move.

Ethan Banks: You make a point here, though, because it's not just about the consumers of these technologies that need to move along; it's also the vendors. It's also the public cloud providers. I mean, everybody is having to rethink how some of these things are done. I think in the early days of public cloud, we were in a situation where public cloud was designed more with hyperscale in mind, and more automated procedures in mind, and a more typical sort of workload in mind. Then the enterprise came along and began consuming cloud more and more and more, and brought their “not hyperscale” kind of problems, where you needed more insights.

Ethan Banks: Going back to the hyperscaler, for example: if you have something that's slow and you have enough telemetry to know that that container running that copy of the workload is just a dog, what do you do? You kill it, you delete it, you restart it, you move on. You don't troubleshoot it, you don't think about it. Oh, something's got malware on it… kill it, restart with the golden image, go. You solve the problems that way. Now, I think enterprise could benefit from that sort of thinking.

Ethan Banks: At the same time, a lot of the applications aren't designed to be able to work that way. It's still a monolith of some sort. It doesn't scale out where you can just kill and restart instances.

Ethan Banks: And so, cloud too, and the vendors that have supported the enterprise and are now supporting enterprises that are moving into cloud, have had to evolve the tools as well. So, there are a lot of things going on here that make this complicated. It's not like “if you just figure out the cloud native thing, you'd be fine.” It's not so simple as that, because not all the tooling is even there to do exactly what we maybe need to do.

Ethan Banks: And so now it's getting to where we've got a lot of minimum viable product and better; we've got a lot of, I think, kind of 2.0 of these sorts of solutions that are out there.

Ethan Banks: And we're learning how to use them, but now it's like, okay, what else do we really need to make these tools as efficient and insightful and inexpensive, from a cloud perspective and a user perspective, as possible?

Mike Valladao: Yep. And I agree with you that the whole reason why people move to cloud is because it's easy to spin things up, it's easy to take care of the scalability, you don't have to wait three months to order the gear and then build it out; it happens, really, really fast.

Mike Valladao: The drawback of course is, we still have to secure it, we still have to maintain it, we still have to do all these things. And we're just talking through different ways of possibly getting that accomplished.

Mike Valladao: So, any other best practices or possible gotchas that come to mind? Anything else we should mention here?

Ethan Banks: Well, if it's best practices we're talking about around, you know, security, to me the biggest thing is to get all the right folks engaged as you're beginning to move these workloads up to the cloud. You have a security model. You have a governance model… data governance and so on. Okay, before you move to cloud, that conversation needs to be had. That is, get all the folks around the table and discuss how you're going to take your long-standing set of processes, procedures, rules, and governance, and then map that into the cloud. It will open up a very large discussion.

Ethan Banks: We talked earlier, Mike, about who is responsible for security, and I don't know that we came up with a clear answer, because I don't think there's a clear answer to give. It depends so heavily on your organization, how you're using the cloud, what rules the different applications fall under, and so forth. Which underscores the point here: you've got to have everyone together in agreement on what's happening, and then establish what those security rules are going to be. And that's in two tiers. On the one tier, it is kind of the rules, the governance model, what the security models look like. And then, on the other, it is who the operators are that are going to bring that security to life and, as I mentioned earlier, ideally in an automated way where you've established this stuff: what these rules are going to look like, what these profiles are going to look like. As you bring a workload into existence, those security profiles are automatically applied so that devs don't have to think about it. Because if there's someone that shouldn't be responsible for security, I mean, at least at an infrastructure level, it's the devs. They don't want to have to think about that; that should just be there for them. Let them worry about application-level security.
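The "profiles applied automatically as a workload comes into existence" idea can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the profile names, the settings inside them, and the `deploy` function stand in for whatever rules and deployment tooling an organization actually agrees on.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical baseline profiles, agreed once by security, governance, and operations.
SECURITY_PROFILES: Dict[str, Dict[str, object]] = {
    "public-web": {"allowed_inbound_ports": [443], "encrypt_at_rest": True, "flow_logs": True},
    "internal": {"allowed_inbound_ports": [], "encrypt_at_rest": True, "flow_logs": True},
}


@dataclass
class Workload:
    name: str
    classification: str                  # chosen by the dev team; maps to a profile
    settings: Dict[str, object] = field(default_factory=dict)


def deploy(workload: Workload) -> Workload:
    """Apply the matching security profile before the workload is created."""
    profile = SECURITY_PROFILES.get(workload.classification)
    if profile is None:
        # Refuse to deploy anything that has no agreed security profile.
        raise ValueError(f"no security profile defined for {workload.classification!r}")
    # Profile values override anything the developer supplied for the same key.
    workload.settings = {**workload.settings, **copy.deepcopy(profile)}
    return workload


checkout = deploy(Workload("checkout", "public-web", {"replicas": 3, "encrypt_at_rest": False}))
print(checkout.settings)
```

In this sketch the developer only picks a classification; the infrastructure-level rules arrive with it, and a workload with no matching profile is rejected rather than deployed unprotected.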

Mike Valladao: I’m so happy that you mention this, and the reason why I say that is because what I’m seeing out there in the wild is very often a lot of the people on the cloud teams aren't talking with their traditional networking and secops folks. It almost seems to be a throwback to the days when networking people didn't talk to their security people.

Mike Valladao: Well, they finally got to where they're doing the same type of things now. Whereas now it almost seems like, often off on the side, there's this “special team” that's dealing with the cloud, and when they do that, they're not always doing all that talking and communicating you're talking about.

Ethan Banks: Well, you almost make me want to get on my desk and yell, because haven't we busted the silos yet? Haven't we figured it out? IT as an organization cannot be aligned along technology silos. It's broken. There are too many dependencies across technologies; you have to build the stack together as a group. This is one of those things that really winds me up, because I see this too, Mike. I see the “oh, we know about AWS, and we're going to put you in this corner, and when we need something stood up in Amazon we're going to send you a ticket and you're going to stand up the thing.” And that poor soul is just trying to figure out, to the best of their knowledge, how to do it. And we've got all these security issues. Today, I heard about an unsecured S3 bucket, and how many years have we been hearing about that problem? You know, can't we get over that yet?

Ethan Banks: It's broken, because you can't rely on an individual to just figure it out because they know something about the tech. All of the technology implementations must be guided centrally, and that best happens when everyone that is involved in an IT group works together to figure out what this thing looks like from top to bottom.

Ethan Banks: It's hard to have those conversations. It is a difficult thing to have. You've got to learn each other's languages and so on. Cloud adds yet another dimension to our silos but, again, we should not have silos anymore. Cloud to me shouldn't be yet another silo; it should be the thing that brings everyone together around the table! You triggered me, Mike, you triggered me, sorry.

Mike Valladao: Obviously I did. And I’m so happy that we talked about that.

Mike Valladao: Ethan, thank you very much for helping us explore visibility in hybrid cloud environments today. Before I let you go, how can our listeners keep tabs on the activities that you're doing moving forward?

Ethan Banks: There are two places you can keep up with me. I am at packetpushers.net; that's where you will find me co-hosting a number of technology podcasts for engineers. You can also find my personal blog at ethancbanks.com, where I write about tech, share opinion pieces, and so on. And on Twitter, I am @ecbanks.

Mike Valladao: Great! That is a wrap of our first episode. If you folks out there have any questions or comments, please visit the Hybrid/Public Cloud collaboration group within the Gigamon Community, which of course is at community.gigamon.com.

Mike Valladao: Our cloud journey has begun. Hang on and enjoy the ride. Thanks, folks!

Defining Hybrid Cloud
Cloud happens when IT is doing other things!
Who is responsible for Cloud security?
The importance of Cloud visibility
Network switches and TAPs in the Cloud
Deploying Agents in the Cloud