Is it stupidity or something else?
From time to time I receive job offers and do job-market research to keep abreast and stay well informed about the current state of the IT sphere and what opportunities exist out there. As a result I have to sift through all kinds of job offers and job descriptions that match the “unix solaris san networking” keywords. And what I’ve noticed (actually, I made that assumption long ago) is that 99.999% of all offerings are alike and follow the same pattern:
- It usually starts by mentioning how big/cool the client is: the biggest/leading bank/investment bank/financial organization/IT consultancy in the region/world, a FTSE/FORTUNE top 10/20/50/you-name-it company, without ever naming the client explicitly.
- It describes thoroughly what an ideal candidate should know and must be able to do. The list of required skills goes on and on beyond the horizon, yet only a small fraction of it will ever be used later. And that’s true ten times out of ten.
- It very often lacks a detailed description of your day-to-day responsibilities and daily workflow.
- It exceptionally rarely even mentions the benefits and perks, not only monetary ones, that an employee could expect in return for his/her diligence.
In the end, such an awkward approach breeds incredulity and raises even more questions. How on earth am I supposed to get interested in a job offer when it enumerates only my obligations and gives nothing back? Why do you refrain from disclosing your client’s name? What about the compensation range? To find answers to these and other questions I have to spend my free, unpaid time just to discover in the end that I don’t find the offer alluring at all. So what’s the point? Please, I beg you, dear and beloved HRs, save your time and ours: post all the key materials and crucial information mentioned above about the job offer you have on your hands, in advance. Period.
I’m done!
Are we going to say goodbye to Sun branded USP V/VM?
Just received it today:
2 March 2010
Dear Valued Partner,
Due to the recent acquisition of Sun Microsystems by Oracle Corporation, there has been much speculation as to the effect the merger will have on the market, product offerings and partnerships. As you are aware, Hitachi Data Systems and Sun Microsystems have enjoyed a successful business partnership. On March 31, 2010, the current distribution agreement that Hitachi Data Systems and Sun Microsystems have been jointly operating under for the past nine years will come to an end.
This relationship has given our partners access to industry-leading storage solutions built on Hitachi technology on which many of the world’s top enterprises have come to rely. With the acquisition of Sun Microsystems, Hitachi Data Systems and Oracle agree that the time is right to evolve this relationship into one reflecting the priorities of the new company. We are jointly determining the positioning of the products and solutions based on Hitachi Data Systems that you have deployed with clients. We understand you and your customers have questions and concerns surrounding service obligations to the global install base moving forward.
Hitachi Data Systems will be answering all questions and concerns with solid transition programs and will focus on meeting the demands of the continued excitement in the marketplace around the Hitachi Data Systems technology and the unique leading edge solutions that the Hitachi Data Systems brand has, and will continue to bring to market. These solutions will continue to be made available to you and your customers under the Hitachi Data Systems brand name.
Details will be forthcoming on programs and processes that will help guide you and your customers, as we transition this business moving forward. A new chapter is here, and Hitachi Data Systems sees great opportunities for you that will materialize in the market. Protecting, developing and growing your business is our top priority.
Nothing is clear from this letter and the consequences of these steps are misty. But, ironically, this new turn in the long-evolving relationship between Sun and Hitachi is planned for the 1st of April, All Fools’ Day.
Expanding Solaris filesystem
Of course, you know about the growfs command, which is the key tool if you’re aiming to expand a UFS file system in a non-destructive way. But sometimes you may see the following error message as the result of a growfs attempt:
failed to disable logging
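For context, the usual non-destructive grow of a live UFS looks roughly like this (a sketch with purely illustrative device and mount-point names; the underlying slice or volume must already have been extended). The -M option tells growfs to grow the file system while it stays mounted on the given mount point:

# growfs -M /export/home /dev/rdsk/c1t0d0s7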
The reason why this error happens is not obvious, though. From the source code it can happen when the return value from rl_log_control() is not RL_SUCCESS. But if we dig deeper into rl_log_control()’s internals the actual reason is still quite obscure. Since rv is initialized to RL_SUCCESS at the very beginning of the function, there are only two places left where it could be set to RL_SYSERR:
rl_result_t rv = RL_SUCCESS;

if (alreadymounted == RL_TRUE)
        fd = open(li.li_mntpoint, O_RDONLY);
else
        fd = open(li.li_tmpmp, O_RDONLY);
if (fd == SYSERR) {
        perror("open");
        rv = RL_SYSERR;
        goto out;
}

fl.nbytes_requested = 0;
fl.nbytes_actual = 0;
fl.error = FIOLOG_ENONE;

if (ioctl(fd, request, &fl) == SYSERR) {
        perror("ioctl");
        (void) close(fd);
        rv = RL_SYSERR;
        goto out;
}
Unfortunately I wasn’t able to dtrace the issue more thoroughly, because stopping xntpd and syslog resolved the problem, and all attempts to reproduce the same behavior on our testbed system didn’t meet with success.
Update 1
Actually I’ve just successfully reproduced the faulty behavior and am going to fiddle with dtrace tomorrow.
Update 2
Found a related bug id 6625306
Update 3
In the end, I didn’t reach the root cause of the problem, but from what I’ve observed I can confirm, as described in the aforementioned bug id, that the error “failed to disable logging” is misleading. Below is the actual sequence of called functions:
mkfs.c:growinit()
    |
    V
roll_log.c:rl_log_control()
    |
    V
ioctl.c:ioctl()
    |
    V
vnode.c:fop_ioctl()
    |
    V
ufs_vnops.c:ufs_ioctl()
    |
    V
ufs_log.c:ufs_fiologenable()
    |
    V
lufs.c:lufs_disable()
Armed with dtrace it was easy to verify that the return value from the ioctl functions was 0, in other words successful, and truss’s output agrees with that:
5858:   open("/", O_RDONLY)                             = 5
5858:   ioctl(5, _ION('f', 88, 0), 0xFFBFC490)          = 0
5858:   ioctl(5, _ION('f', 72, 0), 0xFFBFC4AC)          = 0
5858:   close(5)                                        = 0
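The dtrace side of that check was nothing fancier than a one-liner along these lines (a sketch: growfs is a wrapper around mkfs -G, hence the predicate on the mkfs process, and the exact probe name may differ between releases):

# dtrace -n 'fbt:ufs:ufs_ioctl:return /execname == "mkfs"/ { trace(arg1); }'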
But the net result was still negative and I had to stop xntpd to grow the root file system.
Here are the components which add up to the issue:
- Solaris 10 10/09 s10s_u8wos_08a SPARC
- 5.10 Generic_142900-02 sun4u sparc
- Veritas-5.0_MP3_RP2
- Encapsulated root disk
Update 4
I believe this is going to be the last update in the series of inconsistent postings regarding the “failed to disable logging” error. What I overlooked yesterday is the following check in roll_log.c:
if (((request == _FIOLOGENABLE) && (!logenabled)) ||
    ((request == _FIOLOGDISABLE) && logenabled))
        rv = RL_FAIL;
As I have already mentioned, the second ioctl, which checks _FIOISLOG, returned success in my case. But what I didn’t check was the logic hidden behind it, partly because I misinterpreted the comment next to the _FIOISLOG definition in sys/filio.h, which says:
#define _FIOISLOG _IO('f', 72) /* disksuite/ufs protocol */
I agree it’s a lame excuse. Anyway, here is the code from ufs_log.c:ufs_fioislog():
/*
 * ufs_fioislog
 *      Return true if log is present and active; otherwise false
 */
/* ARGSUSED */
int
ufs_fioislog(vnode_t *vp, uint32_t *islog, cred_t *cr, int flags)
{
        ufsvfs_t        *ufsvfsp = VTOI(vp)->i_ufsvfs;
        int             active;

        active = (ufsvfsp && ufsvfsp->vfs_log);
        if (flags & FKIOCTL)
                *islog = active;
        else if (suword32(islog, active))
                return (EFAULT);
        return (0);
}
So, after running a trivial dtrace script I just confirmed what I should have noticed straight away:
#!/usr/sbin/dtrace -s

#pragma D option quiet

fbt:ufs:ufs_fioislog:entry
/stringof(args[0]->v_path) == "/"/
{
        self->path = stringof(args[0]->v_path);
        printf("Vnode path is %s\n", self->path);
}

fbt:ufs:ufs_fioislog:return
/self->path == "/"/
{
        trace(args[1]);
        self->path = 0;
}
That “failed to disable logging” error message is valid and not misleading, but, as correctly noted in the bug:
lufs_enable()/lufs_disable(), used by ufs_ioctl()’s _FIOLOGENABLE and _FIOLOGDISABLE,
do report errors while attempting to enable/disable the on-disk log via the corresponding
fiolog_t structure defined in ufs_filio.h, and sometimes in addition with a real
error return value returned to ufs_ioctl().
I should’ve been more vigilant and should’ve checked the fiolog_t structure correctly, because initially I made a mistake in the script. So finally, with
#!/usr/sbin/dtrace -s

#pragma D option quiet

fbt:ufs:lufs_disable:entry
{
        self->v_path = stringof(args[0]->v_path);
        self->fiolog = args[1];
        printf("Vnode path - %s\n", self->v_path);
}

fbt:ufs:lufs_disable:return
/self->v_path == "/"/
{
        printf("Return value - %d \nfiolog->error - %d", args[1], self->fiolog->error);
        exit(0);
}
I received the following result:
Vnode path - /
Return value - 0
fiolog->error - 4
So in the end the whole picture became clearer: we failed in the attempt to write-lock the file system.
#define FIOLOG_ENONE    0
#define FIOLOG_ETRANS   1
#define FIOLOG_EROFS    2
#define FIOLOG_EULOCK   3
#define FIOLOG_EWLOCK   4
#define FIOLOG_ECLEAN   5
#define FIOLOG_ENOULOCK 6
Why did it fail to write-lock? Well, further dtracing of ufs_lockfs.c:ufs__fiolfs() revealed that the mount device was simply busy, since on return this function set the return value to 16, which, according to errno.h, means:
#define EBUSY 16 /* Mount device busy */
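That check, too, was nothing more than a rough one-liner (a sketch; the function name is taken from ufs_lockfs.c and its fbt probe may differ between releases):

# dtrace -n 'fbt:ufs:ufs__fiolfs:return { trace(arg1); }'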
Ooh!
Footsteps of spring
Hooray, finally it can be distinctly heard marching down the corduroy road, and once you’re left with the last sheet of a tear-off calendar in your palm you can crumple it and let the spring in. In the end, you don’t have a choice, since tomorrow is the first day of her reign. So you’d better strain your ears to hear her breath in the dripping of melted snow and breathe her in with the first beam of the sun. It’s springtime!
But today is still February, so I spent several hours at the skating rink with my son, teaching him to stand firmly and confidently on skates. It may sound unbelievable, but it was empty and we were the only skaters. When I was a kid myself there used to be lots of boys and girls, alone or with their parents, just skating or playing hockey, and we used to have skating rinks near almost every dwelling house. I still remember how we fought like crazy playing ice hockey till dark, and it didn’t matter that the puck could hardly be seen. I said “used to” because now we prefer to party, to spend/kill time in shopping malls and to watch “dancing on ice”, and maybe that’s one of the reasons why we’ve fscked up the Olympic games.
VxVM is watching after you
A couple of days ago my colleague told me about one neat feature of VxVM 5.x that can be quite helpful in the field. Imagine a situation where a customer complains about a VxVM misconfiguration and blames your team for sloppy work. To prove him wrong, you can sift through VxVM’s command log files to get the list of commands you typed during the initial configuration. If the customer did something wrong himself and is now trying to shift the blame onto you, these log files can be of invaluable help as well: just show where and when he made the mistake. The log files can be found in /etc/vx/log and are named /etc/vx/log/cmdlog and /etc/vx/log/cmdlog.number for the current and historic command logs respectively. There is a vxcmdlog(1M) command to give you some control over this feature.
One thing to keep in mind is that not every command script is logged:
Most command scripts are not logged, but the command binaries that they call are logged. Exceptions are the vxdisksetup, vxinstall, and vxdiskunsetup scripts, which are logged.
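If the defaults don’t suit you, vxcmdlog gives you a few knobs to turn. Something along these lines should do (a sketch from memory of the 5.x man page, so double-check the options on your release): -l shows the current settings, -n sets how many historic log files to keep, and -s caps the size of the current log before it is rotated.

# vxcmdlog -l
# vxcmdlog -n 10
# vxcmdlog -s 512k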
Enjoy!
SunFire 4810 upgrade to Solaris 10u8 issue
Yesterday I was doing a planned upgrade of a SunFire 4810 to Solaris 10u8 and faced the following error just a while after invoking “boot net - install”. It’s worth noting that prior to the OS upgrade I’d successfully updated the OBP to the latest release, 5.20.14.
SunOS Release 5.10 Version Generic_141444-09 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
FATAL: PROM_PANIC[0x0]: assertion failed: TTE_IS_VALID(ttep), file: ../../../sun4u/gen/src/hat_sfmmu.c, line: 741
debugger entered.
Both Google and SunSolve led me to the following bug, but in my case I wasn’t installing with a ZFS root.
Eventually, I used a verbose boot with the full path to the network device instead to jumpstart the server, but I’m not sure whether the original issue wasn’t caused by a moon phase, a solar flare or the solar wind.
boot /ssm@0,0/pci@18,700000/pci@1/SUNW,hme@0,1 -v - install
Going to double check on another SF4810 since I didn’t have enough time during the last maintenance window.
Update
So I played a bit with another SF4810 and did exactly the same steps in the hope of causing the same error as described above, but everything I did was fruitless. Sifting through the code, I only dug out that the assertion fails when tte_inthi >= 0, i.e. when the valid bit, the sign bit of the upper 32-bit half of the TTE, is not set. The PTE (Page Table Entry) related code:
typedef union {
        struct tte {
                uint32_t        v:1;            /* 1=valid mapping */
                uint32_t        sz:2;           /* 0=8k 1=64k 2=512k 3=4m */
                uint32_t        nfo:1;          /* 1=no-fault access only */
                uint32_t        ie:1;           /* 1=invert endianness */
                uint32_t        hmenum:3;       /* sw - # of hment in hme_blk */
                uint32_t        rsv:7;          /* former rsv:1 lockcnt:6 */
                uint32_t        sz2:1;          /* sz2[48] Panther, Olympus-C */
                uint32_t        diag:1;         /* See USII Note above. */
                uint32_t        pahi:15;        /* pa[46:32] See Note above */
                uint32_t        palo:19;        /* pa[31:13] */
                uint32_t        no_sync:1;      /* sw - ghost unload */
                uint32_t        suspend:1;      /* sw bits - suspended */
                uint32_t        ref:1;          /* sw - reference */
                uint32_t        wr_perm:1;      /* sw - write permission */
                uint32_t        exec_synth:1;   /* sw bits - itlb synthesis */
                uint32_t        exec_perm:1;    /* sw - execute permission */
                uint32_t        l:1;            /* 1=lock in tlb */
                uint32_t        cp:1;           /* 1=cache in ecache, icache */
                uint32_t        cv:1;           /* 1=cache in dcache */
                uint32_t        e:1;            /* 1=side effect */
                uint32_t        p:1;            /* 1=privilege required */
                uint32_t        w:1;            /* 1=writes allowed */
                uint32_t        g:1;            /* 1=any context matches */
        } tte_bit;
        struct {
                int32_t         inthi;
                uint32_t        intlo;
        } tte_int;
        uint64_t                ll;
} tte_t;

#define tte_inthi       tte_int.inthi

#define TTE_IS_VALID(ttep)      ((ttep)->tte_inthi < 0)
I wish there were a clearer and more precise description of tte_int.inthi and tte_t.ll in the source code.
Moving ZFS disk into Solaris with UFS
Doing another jumpstart installation I stumbled upon the following error:
Processing profile
        - Opening Flash archive
        - Validating Flash archive
        - Selecting all disks
        - Configuring boot device
ERROR: The boot disk (c0t0d0) is not selected
ERROR: Flash installation failed
Solaris installation program exited.
An obvious move was to check the disk with the format utility. Printing its partition information, I saw the following picture:
Part      Tag    Flag     First Sector        Size        Last Sector
  0 unassigned    wm                 0           0                  0
  1 unassigned    wm                 0           0                  0
  2     backup    wm                34     33.91GB           71116540
  3 unassigned    wm                 0           0                  0
  4 unassigned    wm                 0           0                  0
  5 unassigned    wm                 0           0                  0
  6 unassigned    wm                 0           0                  0
  7 unassigned    wm                 0           0                  0
  8   reserved    wm          71116542      8.00MB           71132925
“How come I have nine partitions?!” was my first reaction. And just in a jiffy I recalled that this disk used to be part of a ZFS pool and of course had been formatted with an EFI label. So, how do you change it back from an EFI to an SMI label? To accomplish that I had to run “format -e”, because without “-e” you can’t change anything.
# format -e c0t0d0
partition> l
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Auto configuration via format.dat[no]? y
partition> print
Current partition table (default):
Total disk cylinders available: 24620 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0               0         (0/0/0)            0
  1       swap    wu       0               0         (0/0/0)            0
  2     backup    wu       0 - 24619      33.92GB    (24620/0/0) 71127180
  3 unassigned    wm       0               0         (0/0/0)            0
  4 unassigned    wm       0               0         (0/0/0)            0
  5 unassigned    wm       0               0         (0/0/0)            0
  6        usr    wm       0 - 24619      33.92GB    (24620/0/0) 71127180
  7 unassigned    wm       0               0         (0/0/0)            0

partition> q
Sorted. Now I was able to continue my interrupted jumpstart.
Send a complete zfs pool
Imagine that you have a zfs pool with dozens of zfs datasets which you need to migrate to another box. “How do I accomplish that?” is a reasonable question. Actually, it’s dead easy with the zfs send/receive mechanism:
# zpool list zones
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
zones  67.5G  30.5G  37.0G    45%  ONLINE  -
First, create a recursive snapshot, and once it’s done you can send/receive the resulting data stream to another host:
# zfs snapshot -r zones@zones_snapshot
# zfs send -R zones@zones_snapshot | ssh user@host "zfs receive -Fd zones"
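And if the source pool keeps changing while you prepare the cut-over, you can later ship just the delta with an incremental stream; something like this, with the second snapshot name being purely illustrative:

# zfs snapshot -r zones@zones_snapshot2
# zfs send -R -i zones@zones_snapshot zones@zones_snapshot2 | ssh user@host "zfs receive -Fd zones"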
Scanty vocabulary
I do believe that nowadays the English language is hugely simplified in day-to-day use. Just listen to this:
Jumping from FC10 to FC11
Finally I had enough spare time to sit down and carefully upgrade my VPS to Fedora Core 11, following the recommendations to steer clear of any possible issues. Needless to say, everything went without a hitch. The only thing that needs to be redone is clamav, which I set up in haste, but that’s trivial.