The SysAdmin Network

No more hiding in the server room

Hi!

 

This is my first post on this site. I have been actively searching for a forum such as this for some time, as I have a number of issues I would like to discuss with Sys Admin minded people rather than developers. As you will suspect, I am a Sys Admin. I work in a reasonably large educational establishment in Scotland (Alba go Bragh!). My department is split into five sections: Help Desk, Central Operations, Development Teams, Site Technicians and, last but not least, Management. The development teams are continually building new servers and services, which are handed over to my team, Central Operations, to support. Currently we have 200 Windows-based servers and a selection of Linux-based servers.

My problem is that Management haven't given any real consideration to the logistics of Update Management, probably because Operations have ensured that security updates have been applied appropriately. We currently apply security updates manually, but this is now proving too difficult to manage due to the number of servers, the clustered services, and having to perform the updates out of core hours, which is taking longer and longer. I have raised my concerns and, to be fair, they were listened to, although I am not sure they were fully appreciated. I have written a document covering the issues I face, with examples of Update Management solutions and how they could be used to improve this service, and presented it to Management. However, due to the effect of the global recession, I doubt there will be any real-terms funding to address my problems.

 

Where I am going with this posting is: what do other people do in the field of Update Management? Do you share my issues and, if so, what attempts have you made to address them? Do you use an Update Management solution, and how do you use it? As an aside, what type of relationship do you have with your development teams? Our development teams drive the department and are constantly talking down to other teams. The best way to describe them is that they will tell a person who can't afford a bus to take a taxi. As you can imagine, this creates some friction. I guess I would like to know about the relationship between your Operations and Development teams.

 

Thanks for listening and I apologise for the length of this posting but I really need to talk to people with these issues.

Replies to This Discussion

As lead tech / manager of a tiny in-house team, we're responsible for all of those sections covering our 60 - 70 servers, so I really do sympathise with you. However, keeping Windows machines up to date is not hard if you use WSUS, which is nicely free from MS.

I would suggest building a WSUS server (or several, depending on your sites and their links) and setting the update type via GPO. However, I wouldn't recommend that you set the servers to automatically install the updates as soon as they arrive, or you may run into issues.

We split our machines into two categories: PCs and servers. The PCs get the updates over lunchtime as soon as they arrive from MS, though we keep an eye out for any troublesome ones to block. The servers are set to download automatically but not install. We then manually review the updates issued from the WSUS server for any known bad ones and install the rest en masse on the last weekend of the month.
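
For anyone who wants to put the monthly server window into a calendar or a reminder script, here is a rough Python sketch of working out "the last weekend of the month". It assumes the weekend is identified by its Saturday; the function names are made up for illustration and are not part of WSUS.

```python
from datetime import date, timedelta
import calendar

SATURDAY = 5  # date.weekday(): Monday = 0 ... Sunday = 6


def last_saturday(year: int, month: int) -> date:
    """Return the last Saturday that falls inside the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    last_day = date(year, month, days_in_month)
    # Step back from the last day of the month to the nearest Saturday.
    offset = (last_day.weekday() - SATURDAY) % 7
    return last_day - timedelta(days=offset)


def patch_weekend(year: int, month: int):
    """Return (friday, saturday, sunday) around the last Saturday.

    Assumption: the window starts on the Friday evening before that Saturday.
    The Sunday may fall in the following month when the month ends on a Saturday.
    """
    saturday = last_saturday(year, month)
    return (saturday - timedelta(days=1), saturday, saturday + timedelta(days=1))
```

Calling `patch_weekend(2024, 2)` gives the Friday, Saturday and Sunday around 24 February 2024, which could then be fed into whatever change calendar you use.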

Unfortunately for us this is getting to be a bit of an extra load too as it means working late Friday and a good few hours over the weekend but it's worth it rather than having unknown updates sneak on through.

We use WSUS to manage our PCs but not our servers, as we didn't find that WSUS allowed us to dictate specifically when the updates were applied. This is a major requirement for us, as we want to ensure updates are applied to servers at specific times so we can manage restarting them. Ideally we would like to apply the patches to a group of servers at, say, 17:00 and restart them at our leisure later that evening.

Update Management has also led us to consider reviewing our Asset Management system, as solutions such as Languard and Shavlik NetProtect provide huge benefits in these areas.

I was interested to hear that you too have to patch out of core hours. This is dedication that doesn't get recognised too often. There are many people who would prefer to do this rather than rock the boat only to lose their role/job to an out-sourced company.

You've got to remember that your update settings are set in a GPO, not by WSUS itself. We've applied ours at the site level so that the PCs are pointed at the local WSUS server, and configured automatic updates to "4 - auto download and schedule the install", with the install day set to every day and the install time set to 13:00, which is generally lunchtime for everyone.

This works great for the PCs but for the servers we like to review the updates first in case of known troublesome updates or even ones that take multiple reboots and / or long install times.

For the servers we've simply set the auto update level to "3 - Auto download and notify for install". This way we've got the updates on the servers waiting for us to review and allow / disallow the installation as required. You can set this to auto-install at say 18:00 and reboot (automatically or manually) as needed but I much prefer to have a quick review before kicking them off.
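
To make the two policies easier to compare, here is a small Python sketch that models them as data. It is only an illustration: the value names (`AUOptions`, `ScheduledInstallDay`, `ScheduledInstallTime`) follow the Windows Update policy registry values as I understand them, with 0 meaning "every day" for the install day. The `UpdateProfile` class and the profile names are invented for this example, and only a subset of the option numbers is listed.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Subset of the "Configure Automatic Updates" options discussed in this thread.
AU_OPTIONS = {
    2: "Notify for download and notify for install",
    3: "Auto download and notify for install",
    4: "Auto download and schedule the install",
}


@dataclass(frozen=True)
class UpdateProfile:
    """Hypothetical description of one automatic-update policy."""
    name: str
    au_option: int
    install_day: Optional[int] = None   # 0 = every day, 1-7 = Sunday..Saturday
    install_hour: Optional[int] = None  # 0-23, local time

    def registry_values(self) -> Dict[str, int]:
        """Return the policy values this profile would correspond to."""
        if self.au_option not in AU_OPTIONS:
            raise ValueError(f"unsupported AUOptions value: {self.au_option}")
        values = {"AUOptions": self.au_option}
        if self.au_option == 4:
            # A schedule is only meaningful for the scheduled-install option.
            if self.install_day is None or not 0 <= self.install_day <= 7:
                raise ValueError("install_day must be 0 (every day) or 1-7")
            if self.install_hour is None or not 0 <= self.install_hour <= 23:
                raise ValueError("install_hour must be between 0 and 23")
            values["ScheduledInstallDay"] = self.install_day
            values["ScheduledInstallTime"] = self.install_hour
        return values


# The two policies described above: PCs install daily at 13:00,
# servers download and then wait for someone to approve the install.
PCS = UpdateProfile("PCs", au_option=4, install_day=0, install_hour=13)
SERVERS = UpdateProfile("Servers", au_option=3)
```

Printing `PCS.registry_values()` and `SERVERS.registry_values()` side by side is a quick way to document what each GPO is meant to do before you build it.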

At the end of the day our job is to support the workers, so that means giving them as much uptime on the systems as possible, and patching / rebooting around when they're using the systems is just one part of that. As to the dedication, we're pretty darn dedicated at our place and regularly pull 200-hour months, with weekend and out-of-hours work as the job requires. It's just part of the job really, and the company makes it worth our while in the end. As I said, we're IT support, so our job is to interfere with the users as little as possible. A lot of IT people seem to forget that, I've found of late.

Languard is a good option but I've not used the other one so can't comment. If money is tight (when isn't it, huh?) I would say that WSUS does the job very well. You do have to look after it a bit with regular clean ups and checking but in the end the price is right IMO.

We use a product called Altiris to manage software and updates/patches to our desktop computers. Our Global team usually deploys the non-emergency patches to a test group a week before they launch to the rest of the Enterprise.

What it comes down to for our Enterprise is that the patches get tested twice and are delayed, on average, two to three weeks after the MS release.

I am of the school that calls for testing before en-masse deployment of anything, patches or software.

If I were you, I would draw up a project plan to use WSUS. It's free from Microsoft, easy to deploy, and you can use Group Policy to administer the type and frequency of the updates.

We use WSUS to manage Windows updates. Workstations get them at night and reboot automatically. We don't do any "formal" testing, but they get deployed to a small group of workstations before getting pushed to the whole company.

Servers download any patches automatically but we apply them manually (like Graycat). We stagger them, so if any problems creep up, we can deal with them or at least put that patch on hold. Sadly, we have a number of business apps that have to be shut down and brought back up in a certain order, so it's still a fairly manual process.

External servers get done first (anything in the DMZ). Then non-critical internal servers, gradually moving up to the more important/finicky servers.
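
The staggering described above can be sketched in a few lines of Python. This is an illustration only: the tier labels, the `Server` class and the example host names are all made up, and the only thing taken from the post is the order (DMZ first, then non-critical internal, then the important/finicky ones).

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable, List, Tuple

# Patch order from least to most risky. The labels are hypothetical.
WAVE_ORDER = ["dmz", "internal-noncritical", "internal-critical"]


@dataclass(frozen=True)
class Server:
    name: str
    tier: str  # must be one of WAVE_ORDER


def patch_waves(servers: Iterable[Server]) -> List[Tuple[str, List[str]]]:
    """Group servers into patch waves, earliest wave first."""
    rank = {tier: index for index, tier in enumerate(WAVE_ORDER)}
    # Sort by wave, then by name so the output is stable and easy to read.
    ordered = sorted(servers, key=lambda s: (rank[s.tier], s.name))
    return [
        (tier, [s.name for s in group])
        for tier, group in groupby(ordered, key=lambda s: s.tier)
    ]


if __name__ == "__main__":
    inventory = [
        Server("erp01", "internal-critical"),
        Server("web01", "dmz"),
        Server("wiki01", "internal-noncritical"),
        Server("print01", "internal-noncritical"),
    ]
    for tier, names in patch_waves(inventory):
        print(tier, names)
```

If a wave throws up a bad patch, you stop there and the later waves never see it, which is the whole point of ordering from least to most critical.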

Anything that would affect end users or customers gets done after hours. (Basically, it sucks to be on call for Patch Tuesday - that's a long week. But we rotate through it, and, well, it's part of the job.)

One thing that's helped quite a bit (IMO) is to get everything written down. We've got a checklist for every server that says "do this, do this, do this, check that these services are up, log in to this app and test this" etc. For some servers it's blissfully short (check one service after reboot), and for the complicated ones at least you have a checklist to follow.
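
A written checklist like this can also be kept as a small script so that the results are recorded the same way every time. The Python sketch below is one possible shape for it, not a description of any particular tool: the `Checklist` class is invented for illustration, and the individual checks are placeholders you would replace with real service or application tests.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Checklist:
    """Hypothetical post-patch checklist for a single server."""
    server: str
    steps: List[Tuple[str, Callable[[], bool]]] = field(default_factory=list)

    def add(self, description: str, check: Callable[[], bool]) -> None:
        """Register a step: a description plus a function returning True on success."""
        self.steps.append((description, check))

    def run(self) -> List[Tuple[str, bool]]:
        """Run every step in order; a check that raises counts as a failure."""
        results = []
        for description, check in self.steps:
            try:
                passed = bool(check())
            except Exception:
                passed = False
            results.append((description, passed))
        return results

    def all_passed(self) -> bool:
        return all(passed for _, passed in self.run())


if __name__ == "__main__":
    # Placeholder checks: swap the lambdas for real tests on your own servers.
    checklist = Checklist("app01")
    checklist.add("service is up after reboot", lambda: True)
    checklist.add("can log in to the application", lambda: True)
    for description, passed in checklist.run():
        print("OK  " if passed else "FAIL", description)
```

The short servers end up with one or two steps and the complicated ones with a long list, but either way whoever is on call follows the same routine.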

My thanks to all who replied to my questions. It appears I am not alone and we are all doing pretty much the same thing when it comes to Update Management. I will feed your responses back to the project team for consideration.

There are several things to consider here. Lots of good answers on this thread on the technical solutions so I am going to pick up on the non-technical aspects. Specifically, relationships, procedures, and management. All three are related.

You say that "development teams are continually building new servers and services which are handed over to ... Central Operations to support." I think this sentence forms part of the foundation for the problems you describe. Are the servers being built to a standard? Do they arrive fully documented? Are the applications on those servers fully documented? Is there a handover process in place to make sure that all of your requirements are met before you accept the servers into support?

If any of these things are missing then I would strongly recommend fixing that. I guarantee that you will get complaints about red tape and paper work. And I guarantee that if you ignore them and make the procedure happen, it is going to help some of these problems go away.

You also say that "[o]ur development teams drive the department and are constantly talking down to other teams." I'm guessing here that you mean software development. Unfortunately, in my experience this is quite common. It shows a lack of understanding of the fact that, although your skills lie in other areas, they are just as deep and important as their l33t coding skillz. The way I have handled this in the past is to ignore it. We are professionals who are here to do a professional job. If somebody wants to bitch and snipe, let them. Rise above it and be the better techie :-)

With the Management Team, I think that the best that you can do is to start a paper trail. Document all of your suggestions along with benefits of doing them, and the risks of not doing them. Send that on to the Management team including a polite request for a response by a reasonable date. (About one or two weeks from sending it.) Then it either happens, in which case good, or it doesn't happen, in which case if any of the risks that you highlighted occur you have lots of reason to deploy your solution.

Unfortunately, one thing you cannot fix is the out of hours work. It's part of what we do, and as much as it sucks to be patching servers on Friday nights and weekends, it goes with the territory. Improving some of these other areas will help reduce the amount of time needed to do this work.

Hope that helps in some way.
Andy

Thanks for the reply. It is good to hear how common problems are dealt with.

We have been trying to implement a better handover process involving improved documentation. To this end I have been looking at MOF to see what it might bring to the team. Initial feelings are that the Development teams do not like the idea of filling in documents and would prefer a Configuration Management Database. It is felt that Ops are asking for too many documents to be completed. I accept that this might be the case as we try out new ways to do things. Unfortunately, written documentation is a necessity no matter what the Dev guys think.

One example of the problems I have faced in trying out MOF involves the Operational Health Review. I had looked after a service for just over twelve months when the first big change to the service was implemented. It involved both a software change and a change to the teams who provide support for this service. I asked the signatories to the original Handover (Operational Guide) document if they could review the document in light of the changes I had made to it to reflect the changes in the service. I explained that, in order for the document to remain relevant, changes should be agreed. This didn't go down well, at which point the Configuration DB was mentioned. I felt that the Dev guys were missing the point of the OHR, which isn't simply to reconfirm service baselines. I now find myself wondering how I am going to move forward when the very people (the Dev team) who recommend MOF aren't prepared to embrace it if it affects them.

Another example is Delegated Admin Permissions, where the Dev team involved in building servers and installing applications don't think it should apply to them.

Out-of-hours work, I have to accept, is an occupational hazard. My problem is that Management don't understand that it isn't efficient to do things manually when there are over 200 servers. Work may still be required out of hours, but the amount of Flexi-Time/Overtime will be reduced if the process is managed better with appropriate tools.

© 2012   Created by Elizabeth Ayer and Michael Francis.