*shudder* i hate almost losing a server

SOAPboy
Posts: 8268
Joined: Sun Apr 13, 2003 7:00 am

Post by SOAPboy »

Some of you may know, I'm a network admin for a casino.

I'm on graves atm. It's a total snoozefest normally, but tonight, not so much.

First, I get a call about some computer crashing, no biggie. Can't be fixed, needs replacing, yadda yadda.

I get back to our server room, and our server that runs the entire casino floor is beeping, like crazy.

Panic mode.

Turns out, it's just an HD going out. Company will be here tomorrow.


But jesus christ. Talk about scary shit.
AmIdYfReAk
Posts: 6926
Joined: Thu Feb 10, 2000 8:00 am

Post by AmIdYfReAk »

yea, i hate it when that happens...

is it mirroring or?
SOAPboy
Posts: 8268
Joined: Sun Apr 13, 2003 7:00 am

Post by SOAPboy »

Yeah, mirroring. It's all good. It's just that the beeping it makes is the same sound it would make if it was going to go down.

It's hooked up to our "outside" company, so they knew about it. They just failed to mention it to me. :olo:
Nightshade
Posts: 17020
Joined: Fri Dec 01, 2000 8:00 am

Post by Nightshade »

Can't replace a HDD? SUPAR ADMIN!
Doombrain
Posts: 23227
Joined: Sat Aug 12, 2000 7:00 am

Post by Doombrain »

i think he's talking about a blade, not a hd?
SOAPboy
Posts: 8268
Joined: Sun Apr 13, 2003 7:00 am

Post by SOAPboy »

Doombrain wrote:i think he's talking about a blade, not a hd?
HD in one of the many racks of HDs, so yes, technically.
It's an odd setup. Not like normal rack servers, but similar.
Nightshade wrote:Can't replace a HDD? SUPAR ADMIN!
There's a reason we pay the big money for these servers, bud. So we don't have to stock spare HDs everywhere..

It's about time something finally went out on it tho. It's not had a single reboot, nor a second of downtime, in 3 years? Prolly longer. Konami + Linux is an amazing beast. Fucking thing never goes down. HD goes out, big whoop, some dude shows up that day (or next morning in this case), plops another one in, and that's that.

Our Windows servers on the other hand. Lmfao. We're lucky to see 1 month on those pieces of shit.


Mind spelling errors and shit. It's 6am -_-
Nightshade
Posts: 17020
Joined: Fri Dec 01, 2000 8:00 am

Post by Nightshade »

Yeah, how's that having no spares thing working out for you?
+JuggerNaut+
Posts: 22175
Joined: Sun Oct 14, 2001 7:00 am

Post by +JuggerNaut+ »

nay0k wrote:Yeah, how's that being a gay retard working out for you?
not here.
SOAPboy
Posts: 8268
Joined: Sun Apr 13, 2003 7:00 am

Post by SOAPboy »

Nightshade wrote:Yeah, how's that having no spares thing working out for you?
Great.

We have spares for other servers.


You just need to understand the industry I'm in. Casinos aren't office buildings, and these servers aren't something "we" need to be fooling with. Hence the hiring of companies to do it for us.

It's not "Bob's law firm" here. It's a fucking casino. >_<


Now let's assume for 1/2 a second that it actually goes tits up.
I make 1 phone call, and someone's here within the hour. I'd be running damage control for all the morons freaking out on the floor :P
Grudge
Posts: 8587
Joined: Mon Jan 28, 2002 8:00 am

Post by Grudge »

great
Nightshade
Posts: 17020
Joined: Fri Dec 01, 2000 8:00 am

Post by Nightshade »

SOAPboy wrote:
Nightshade wrote:Yeah, how's that having no spares thing working out for you?
Great.

We have spares for other servers.


You just need to understand the industry I'm in. Casinos aren't office buildings, and these servers aren't something "we" need to be fooling with. Hence the hiring of companies to do it for us.

It's not "Bob's law firm" here. It's a fucking casino. >_<


Now let's assume for 1/2 a second that it actually goes tits up.
I make 1 phone call, and someone's here within the hour. I'd be running damage control for all the morons freaking out on the floor :P
I guess I'm not really understanding your position then. Are the servers your responsibility? Is it just that one that's not?
SOAPboy
Posts: 8268
Joined: Sun Apr 13, 2003 7:00 am

Post by SOAPboy »

Nightshade wrote:
SOAPboy wrote:
Nightshade wrote:Yeah, how's that having no spares thing working out for you?
Great.

We have spares for other servers.


You just need to understand the industry I'm in. Casinos aren't office buildings, and these servers aren't something "we" need to be fooling with. Hence the hiring of companies to do it for us.

It's not "Bob's law firm" here. It's a fucking casino. >_<


Now let's assume for 1/2 a second that it actually goes tits up.
I make 1 phone call, and someone's here within the hour. I'd be running damage control for all the morons freaking out on the floor :P
I guess I'm not really understanding your position then. Are the servers your responsibility? Is it just that one that's not?
All of them are, to an extent.

That "one" is just, well, something we don't generally fuck with due to its uptime and its importance to the firm as a whole.

If we have to do shutdowns, then yes, we have to screw with it.


All the other servers tho, I'm responsible for.
Qr7
Posts: 184
Joined: Mon Apr 09, 2001 7:00 am

Post by Qr7 »

why the fuck does the whole thing run on one machine. that's a terrible design.
Giraffe }{unter
Posts: 2941
Joined: Fri Mar 17, 2000 8:00 am

Post by Giraffe }{unter »

lol try being mid-day at a company with a 1-2 million a day shipping goal and having the entire data center shut down without notice.

On any normal day this would not be an issue. We have a 100 kVA UPS running our data center, backed by 2 generators.

A generator is only as good as the person who maintains it. Turns out our Maint department forgot to lubricate the transfer switch. The generators were running but not feeding power. They tinkered with it for a while, then came in and said you have about 5 minutes to shu....

Poof, lights go out, servers go down. "I guess it was less than 5."

Damage to the UPS from being drained caused it to fail the following day when we took another hit, taking down our data center yet again.

Talk about stress!

Nevermind
LawL
Posts: 18358
Joined: Wed Mar 01, 2006 5:49 am

Post by LawL »

I'm far too hetero for this thread. Carry on.
Fender
Posts: 5876
Joined: Sun Jan 14, 2001 8:00 am

Post by Fender »

Giraffe }{unter wrote:On any normal day this would not be an issue. We have a 100 kVA UPS running our data center, backed by 2 generators.
Our entire building is powered by a massive flywheel. We have a rather large diesel engine that spins the flywheel if we lose power from the electric company. Our UPS are used to bridge the short gap between loss of power and the diesel start up.
SOAPboy
Posts: 8268
Joined: Sun Apr 13, 2003 7:00 am

Post by SOAPboy »

Giraffe }{unter wrote:lol try being mid-day at a company with a 1-2 million a day shipping goal and having the entire data center shut down without notice.

On any normal day this would not be an issue. We have a 100 kVA UPS running our data center, backed by 2 generators.

A generator is only as good as the person who maintains it. Turns out our Maint department forgot to lubricate the transfer switch. The generators were running but not feeding power. They tinkered with it for a while, then came in and said you have about 5 minutes to shu....

Poof, lights go out, servers go down. "I guess it was less than 5."

Damage to the UPS from being drained caused it to fail the following day when we took another hit, taking down our data center yet again.

Talk about stress!

Nevermind
lmfao i hear ya on UPS bullshit.
Some crazy-ass amount of money for 15 min of extra uptime. And if the generator doesn't kick on? The casino's shut the fuck down. And god forbid our AC for the server room actually comes back up when we kick over to generator power :olo:


Qr7 wrote:why the fuck does the whole thing run on one machine. that's a terrible design.
You evidently don't understand how casino floors work.

1 server runs the slot machines and any other "ticket" in-and-out setup. Example: those silly virtual blackjack tables.

It's very simple. It makes sense, and there ARE fail-safe setups involved.

There's a reason the thing runs 24-some-odd hard drives for very little data.

Look into gaming systems. It's simple stuff. Now if we were a huge Las Vegas size casino, we wouldn't be on 1 server. And here in the very near future we're doubling everything.
Giraffe }{unter
Posts: 2941
Joined: Fri Mar 17, 2000 8:00 am

Post by Giraffe }{unter »

Fender wrote:
Giraffe }{unter wrote:On any normal day this would not be an issue. We have a 100 kVA UPS running our data center, backed by 2 generators.
Our entire building is powered by a massive flywheel. We have a rather large diesel engine that spins the flywheel if we lose power from the electric company. Our UPS are used to bridge the short gap between loss of power and the diesel start up.
That's the same situation: our UPS was drained due to the generator not transferring once it came up to speed. On a normal day the UPS should take a 1 minute hit, then regulate the generator feed if necessary.

We're switching to individual rack UPS systems now. Each UPS will be able to power down its servers in the event of generator failure. 10 guys shutting down 70+ servers properly in 5 minutes is just not possible.
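
For a rough sense of why that is, here is a minimal sketch in Python. Everything in it is hypothetical (the function names, the status fields, the 60-second margin); none of it comes from any real UPS software. Split 70 servers across 10 people with 5 minutes on the clock and each person gets well under a minute per box, which is the case for letting each rack UPS trigger the shutdown on its own.

```python
import math
from dataclasses import dataclass


def seconds_per_server(servers: int, people: int, minutes: float) -> float:
    """Time each person can spend per server if the work is split evenly."""
    per_person = math.ceil(servers / people)  # the busiest person's share
    return minutes * 60 / per_person


@dataclass
class UpsStatus:
    """Hypothetical snapshot of what a rack UPS might report."""
    on_battery: bool
    runtime_left_s: float


def should_start_shutdown(status: UpsStatus, shutdown_time_s: float,
                          margin_s: float = 60.0) -> bool:
    """Begin a clean shutdown once the remaining battery runtime is only
    just enough to cover the shutdown itself plus a safety margin."""
    return status.on_battery and status.runtime_left_s <= shutdown_time_s + margin_s


if __name__ == "__main__":
    # 70 servers, 10 people, 5 minutes: seconds available per server
    print(round(seconds_per_server(70, 10, 5), 1))
    # On battery with 240 s left and a 180 s shutdown: time to start?
    print(should_start_shutdown(UpsStatus(True, 240.0), shutdown_time_s=180.0))
```

The margin is the knob to tune: too small and a slow box gets cut off mid-shutdown, too large and a short blip takes servers down for nothing.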
SOAPboy
Posts: 8268
Joined: Sun Apr 13, 2003 7:00 am

Post by SOAPboy »

Giraffe }{unter wrote:
Fender wrote:
Giraffe }{unter wrote:On any normal day this would not be an issue. We have a 100 kVA UPS running our data center, backed by 2 generators.
Our entire building is powered by a massive flywheel. We have a rather large diesel engine that spins the flywheel if we lose power from the electric company. Our UPS are used to bridge the short gap between loss of power and the diesel start up.
That's the same situation: our UPS was drained due to the generator not transferring once it came up to speed. On a normal day the UPS should take a 1 minute hit, then regulate the generator feed if necessary.

We're switching to individual rack UPS systems now. Each UPS will be able to power down its servers in the event of generator failure. 10 guys shutting down 70+ servers properly in 5 minutes is just not possible.
Thank christ we're only running 8ish. It's not too bad to shut those down with 1-3 people in 15 min.

And I really would like to look into other UPS solutions. This big heap of metal in the center of our room is obnoxious.
^misantropia^
Posts: 4022
Joined: Sat Mar 12, 2005 6:24 pm

Post by ^misantropia^ »

Qr7 wrote:why the fuck does the whole thing run on one machine. that's a terrible design.
I didn't read the whole thread (yet) but to answer your post: heaps of applications don't scale beyond a single-machine setup. People* who think it's just a matter of hooking up a couple more servers are plain and demonstrably wrong.

* In my professional life this often equates to "managers" or "sales reps".
Foo
Posts: 13840
Joined: Thu Aug 03, 2000 7:00 am
Location: New Zealand

Post by Foo »

Giraffe }{unter wrote:We're switching to individual rack UPS systems now. Each UPS will be able to power down its servers in the event of generator failure. 10 guys shutting down 70+ servers properly in 5 minutes is just not possible.
We have 67 servers, 2 or 3 guys if we're lucky, a single unmanaged UPS with no alert generating system. Oh and no emergency downing plan.

We're so fucked.
duffman91
Posts: 1278
Joined: Thu Jan 25, 2001 8:00 am

Post by duffman91 »

SOAPboy wrote:Some of you may know, I'm a network admin for a casino.

I'm on graves atm. It's a total snoozefest normally, but tonight, not so much.

First, I get a call about some computer crashing, no biggie. Can't be fixed, needs replacing, yadda yadda.

I get back to our server room, and our server that runs the entire casino floor is beeping, like crazy.

Panic mode.

Turns out, it's just an HD going out. Company will be here tomorrow.


But jesus christ. Talk about scary shit.
What city and casino group do you work for that has such a weak infrastructure for a casino floor?

Vegas has several AS/400 servers with multiple backbones that control the entirety of each gaming resort group of casinos. On top of this, all System i servers have support contracts with IBM. The standard is for the IBM tech to show up at the data center before anybody notices anything went down.

Just curious....
Qr7
Posts: 184
Joined: Mon Apr 09, 2001 7:00 am

Post by Qr7 »

^misantropia^ wrote:
Qr7 wrote:why the fuck does the whole thing run on one machine. that's a terrible design.
I didn't read the whole thread (yet) but to answer your post: heaps of applications don't scale beyond a single-machine setup. People* who think it's just a matter of hooking up a couple more servers are plain and demonstrably wrong.

* In my professional life this often equates to "managers" or "sales reps".
and heaps of applications are shit. if you have mission critical applications such as this, you should design them to scale across multiple machines, or at least to have a failover.

Where I work, if we lost power in a DC, another DC would pick right up. I can go unplug a whole rack and no one would really care. at any point we can have up to 5% of our machines be down and no one really cares.

Hardware is cheap, and building parallel communication into an application is getting easier and easier. So don't give me this 'apps don't scale'.

Do you think your bank runs on 1 server? lol.
Nightshade
Posts: 17020
Joined: Fri Dec 01, 2000 8:00 am

Post by Nightshade »

lol, Queer7's upset again. :olo:
Qr7
Posts: 184
Joined: Mon Apr 09, 2001 7:00 am

Post by Qr7 »

Nightshade wrote:lol, Queer7's upset again. :olo:
once again you add nothing to the conversation and resort to attacks.