The SysAdmin Network

No more hiding in the server room

In the “Walk away with a Slingbox!!!! Read on…” thread the subject of OOB management came up. I wanted to expand on that, so I’ve started this thread to avoid threadjacking the other discussion.

Quick definition of OOB (Out Of Band) management: OOB involves the use of a dedicated management channel for device maintenance and control that is independent of the current status of the device. An example of this would be a server that has an embedded service processor (SP) that allows you to control the server regardless of whether the server is turned on, bluescreen’d/panicked, operating normally, etc.

Currently almost all of our gear is on the OOB network, but the OOB network is just a VLAN on the production network. That means it’s not resilient to network failure, and there’s no security between the OOB network and the production network. I’d like to rebuild that network to make it more reliable and more secure.

Most of our SPs have a serial interface in addition to the network one, so I’m thinking about using some sort of serial access server to complement the network. Then if the network goes down I have a redundant way to access the OOB SPs. Redundancy is important because I don’t actually use the OOB unless things are already broken in some way. I’d also like to add a second internet connection of some sort.
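To make the fallback idea concrete, here is a minimal sketch of "try the SP over the network first, then fall back to the serial access server". The host names and ports are made up for illustration, and it assumes each access path is reachable over TCP (for example SSH on the SP, and SSH on a console server reached through the second internet connection). It only picks a path; it doesn’t log in to anything.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def pick_access_path(paths, probe=tcp_reachable):
    """Return the name of the first reachable path, or None if all are down.

    paths is an ordered list of (name, host, port) tuples, preferred first.
    probe is injectable so the logic can be exercised without a network.
    """
    for name, host, port in paths:
        if probe(host, port):
            return name
    return None


# Hypothetical hosts: the SP's own NIC first, the serial access server second.
OOB_PATHS = [
    ("sp-network", "sp-web01.oob.example", 22),
    ("serial-console", "console01.oob.example", 22),
]

if __name__ == "__main__":
    print(pick_access_path(OOB_PATHS))
```

The ordering of the list is the whole policy here: whichever path you trust least to share a failure with production should probably go last, since that’s the one you’ll be relying on when everything else is down.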

For security I’m thinking about throwing in a firewall between the OOB network and production and making people bring up a VPN to access it. Or perhaps we could put the Nagios box in the OOB network and use it as a sort of a bastion host between the OOB and production networks.

I’m interested in what other people are doing for OOB and remote management in general. Do you use KVM over IP devices or are you using “lights out” service processors? Do you like them? What about power control? How about switches and routers? What are you doing for those? Do you keep OOB traffic and access segregated like I’m suggesting or do you have it all on one network? Why or why not?

I’ve included some screenshots of the functionality offered by the SPs on my servers. It’s not super impressive or the best or anything but it works well. I’m interested in what else is out there so I’d love to see what you’re using and hear your opinions of what you have.




This is the SP on one of our servers. The SP is a separate daughterboard with its own NIC (blue box) and serial interface (red box). The SP is completely independent of the mainboard and is powered on if either of the power supplies is plugged in. I’d like to wire up the serial interface as well as the NIC for backup access.



This is a screenshot of the power control through the web interface. Not much else to say. The SP is also accessible over SSH and IPMI, but I doubt anyone wants to see pictures of a CLI.
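For anyone who does want the CLI version, here’s a rough sketch of the same power control over IPMI by wrapping the `ipmitool` command. It assumes the SP speaks IPMI over LAN with the `lanplus` interface and that `ipmitool` is installed on the machine you run this from; the host and user names are placeholders, not our real ones. The `-E` flag tells `ipmitool` to read the password from the `IPMI_PASSWORD` environment variable so it stays off the command line.

```python
import subprocess

# Power subcommands accepted by "ipmitool chassis power".
VALID_ACTIONS = {"status", "on", "off", "cycle", "reset", "soft"}


def build_power_command(sp_host, user, action):
    """Build the ipmitool argument list for a chassis power action."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 over LAN (assumed to be supported by the SP)
        "-H", sp_host,     # address of the service processor, not the server OS
        "-U", user,
        "-E",              # read the password from the IPMI_PASSWORD env variable
        "chassis", "power", action,
    ]


def run_power_command(sp_host, user, action, timeout=15):
    """Run the command and return ipmitool's output; raises on failure."""
    result = subprocess.run(
        build_power_command(sp_host, user, action),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical SP address and account.
    print(run_power_command("sp-web01.oob.example", "admin", "status"))
```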




View of the temp sensors. We get emailed if it goes out of range. You can get this data with SNMP as well.



View of the Remote KVM app. When you're using it you can definitely tell that it's remote; it has a slight lag similar to VNC screenscraping. But it's perfectly usable and, most importantly, it's independent of the OS, so you can work with the BIOS or install an OS remotely. You can redirect your local cdrom/floppy to the server or map an ISO image to the server. It's pretty awesome to install an OS from an ISO on your laptop to a server hundreds of miles away.

So that's some of what I've got. If you have iLO or the like then this is all pretty similar I'm sure. What do you have?



Replies to This Discussion

When deciding between using serial console management vs. ILOM (Integrated Lights Out Management) you should weigh the advantages of each. Serial console connectivity is truly "out of band access". That is to say, you won't be connecting to the same fan-out switch as the Ethernet ports on your box. Another advantage is reduced network overhead. Managing a serial console device such as a Cyclades/Avocent along with 'Conserver' software is perhaps a bit less overhead than DHCP over Ethernet, though the difference is small enough that it only really matters in large environments. The advantages of ILOM cannot be ignored, however. Vendors such as Sun provide value-added items such as page alerts, logging and web interface connectivity.

I recommend using both the ILOM and a serial console connection for production environments. As you suggested, relying on the ILOM alone leaves you vulnerable to network device failure. Of course, the serial management device could also fail, but at least you don't have a single point of failure. Such a setup is indeed a lot more work. But if you are striving to achieve five-nines for your services, it's well worth the effort.
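To put rough numbers on that, here is a small back-of-the-envelope sketch. It shows how much downtime per year a given availability allows, and how two access paths combine if (and this is a big assumption) they fail independently of each other. The 99% figures in the example are purely illustrative, not measurements of any real ILOM or console server.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, including leap years


def downtime_minutes_per_year(availability):
    """Minutes of allowed downtime per year for an availability in [0, 1]."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(*paths):
    """Availability of redundant paths, assuming independent failures.

    Access is lost only when every path is down at the same time, so the
    combined unavailability is the product of the individual ones.
    """
    unavailable = 1.0
    for availability in paths:
        unavailable *= 1.0 - availability
    return 1.0 - unavailable


if __name__ == "__main__":
    print(f"five nines: {downtime_minutes_per_year(0.99999):.2f} min/year")
    both = combined_availability(0.99, 0.99)  # illustrative figures only
    print(f"two 99% paths: {both:.4%} available")
```

The independence assumption is the part to be careful with: if the serial access server hangs off the same switch, power feed or uplink as the ILOM network, the two paths fail together and the real number is much closer to that of a single path.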

Phillip Pacheco
I agree with you that serial is the most out-of-band method. My view is that access to ILOM via serial or network shouldn’t be a choice; rather, you should have both. Like you said, it’s very important to eliminate single points of failure.

Like I said in my post, I don’t use this for routine work, so assuming the network will be up is a bad move: a network failure may very well be the reason I need to use the ILOM in the first place.


© 2012   Created by Elizabeth Ayer and Michael Francis.
