﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	resolution	keywords	cc
20	scripts LVS design issues	andersk		"(Imported from [https://help.mit.edu/Ticket/Display.html?id=431727 help.mit.edu #431727].)

  Now that Nagios doesn't suck, we can actually see the scripts outage caused by the AFS server restart every Sunday morning. This made me realize a few things:
  
  * Our fallback to hodge-podge isn't just an exceptional condition; it happens every week. Thus it's an even worse idea than I thought it was. Viewers will get confused, and search engines may remove pages from their indexes, if they happen to get a 404 error from hodge-podge at the wrong moment.
  
  * Since the heartbeat script lives in the scripts locker, the AFS server that serves it (aegisthus) is a single point of failure. Ideally LVS would check multiple heartbeat scripts in lockers on several different AFS servers, and keep routing connections as long as at least one of them responds.
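
A rough sketch of the any-of-several heartbeat check suggested in the second point, not the check LVS actually runs. The URLs and the names `HEARTBEAT_URLS`, `http_probe`, and `backend_is_up` are hypothetical placeholders, and treating only an HTTP 200 as success is an assumption. The idea is just that a backend counts as up if at least one heartbeat script, each served from a locker on a different AFS server, answers.

```python
import http.client
import urllib.error
import urllib.request

# Hypothetical heartbeat URLs, one per locker on a different AFS server.
HEARTBEAT_URLS = [
    'http://backend.example.invalid/~locker-a/heartbeat',
    'http://backend.example.invalid/~locker-b/heartbeat',
    'http://backend.example.invalid/~locker-c/heartbeat',
]


def http_probe(url, timeout=5.0):
    '''Return True if the URL answers with HTTP 200 within the timeout.'''
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Timeouts, refused connections, and HTTP errors all count as down.
        return False


def backend_is_up(urls, probe=http_probe):
    '''Return True if at least one heartbeat URL responds.

    any() stops at the first success, so a healthy backend costs one
    request; all URLs are tried only when the earlier ones fail.
    '''
    return any(probe(url) for url in urls)
```

With a check like this, losing one AFS server (for example during the Sunday restart) would only take the backend out of rotation if every heartbeat locker were unreachable at once. The probe is passed in as a parameter so the decision logic can be exercised without touching the network.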
"	defect	closed	minor		web	fixed		
