When no documentation is better than bad documentation.
I’ve just returned from Vladivostok, where I spent a day replacing a battery in a Sun SE 6120 disk array. What could be easier than that? Nothing, unless you’ve been misled by broken documentation. Here is a quote from Sun/Oracle’s official document (Sun StorEdge™ 6020 and 6120 Arrays System Manual):
Once a battery has been physically replaced in a given PCU and that PCU has been reinstalled in the tray, no further action is required. The system updates the battery FRU information as needed without operator intervention.
Piece of cake: just swap the faulty battery and you’re good to go. Not really. After the battery was replaced, “refresh -s” still reported it as failed. “refresh -c” was no help in that situation, since the test will not start if there is even a single faulty battery in a unit.
Just to be on the safe side I tried a second battery (all of them were original and new) and even a new PCU, but the end result was identical. Since I knew that the batteries were good, I had to use special dot commands to fix the issue:
# sun
# password:
# .bat -c u1pcu2
That cleared the battery’s status, so “refresh -s” now reported it as “normal” and the battery started charging. As soon as it was completely charged
# .bat -i u1pcu2
was run to initialize the battery warranty date, and then it was time for “refresh -c” to put the battery under test.
The end result: don’t blindly trust any documentation until you’ve verified it through your own experience.
P.S. The client told me that the last time they observed exactly the same behavior, they simply turned off the array and all the dependent services.