Thursday, November 02, 2006

OpenOffice icon fix for Solaris

One thing that has always bugged me when I install Solaris and add OpenOffice.org 2.x is that the icons for OpenOffice.org never appear next to their menu entries in JDS. I know it is not life threatening, and it does not stop the application from working. It just looks bad!

A quick and dirty fix is to change the "Icon" entry in the relevant .desktop files in /usr/share/applications. For a slightly quicker and cleaner fix, I have written a small script which just adds symbolic links to the installed icon files. The script does this by registering the links with the installed OpenOffice.org desktop integration package, which ensures that the links are removed when the package is removed.

#!/bin/sh
# Usage: openoffice-icon-fix
#
# Currently OpenOffice.org for Solaris is delivered with the wrong filenames
# for the icons in the .desktop files. This means the icons will not show
# up in the JDS menus. To work around this, this script adds a link to the
# original file in the Solaris package database, which in turn creates a
# physical link in the filesystem (via installf -f).
#

PKGINST=openofficeorg-desktop-integratn
APPS="writer printeradmin math impress draw calc base"
PIXDIR=/usr/share/pixmaps
OLD=openoffice.org-2.0
NEW=openofficeorg-20

#
# Check to see if OpenOffice.org is installed.
#
/usr/bin/pkginfo -q ${PKGINST} || {
    echo "${PKGINST} is not installed."
    echo "Please install OpenOffice.org 2.x before running this command."
    exit 1
}

#
# Note: The trailing 's' on the end of the installf command is NOT a typo!
# It is the ftype argument and marks the entry as a symbolic link.
#
for i in ${APPS}
do
    /usr/sbin/installf ${PKGINST} ${PIXDIR}/${NEW}-${i}.png=${OLD}-${i}.png s
done

# Finalize the package database changes
/usr/sbin/installf -f ${PKGINST}
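
Before touching the package database, it can be reassuring to see exactly which links the loop would register. The following is only a dry-run sketch using the same variable values as the script. The helper name icon_link_specs is made up for illustration and is not part of any Solaris tool. It just prints the "link=target" arguments that would be handed to installf.

```shell
#!/bin/sh
# Dry run: print the link specifications the fix script would register.
# icon_link_specs is a hypothetical helper name; nothing here calls installf.
APPS="writer printeradmin math impress draw calc base"
PIXDIR=/usr/share/pixmaps
OLD=openoffice.org-2.0
NEW=openofficeorg-20

icon_link_specs() {
    for app in ${APPS}
    do
        # Same "link=target" form the script passes to installf
        echo "${PIXDIR}/${NEW}-${app}.png=${OLD}-${app}.png"
    done
}

icon_link_specs
```

If the printed names do not match the icon files actually present in /usr/share/pixmaps on your system, adjust OLD and NEW before running the real script.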

Good Hunting!

Saturday, October 21, 2006

VOLD is dead. Long live Tamarack

After a quick check of the recent ON Nevada changelog, I saw that one of my pet Solaris hates, "vold", has been deleted and replaced by "PSARC case 2005/399 : Tamarack: Removable Media Enhancements in Solaris". Sun, THANK YOU!!!

Not only is vold gone, but you get HAL and Joliet extensions thrown into the mix. After a quick download and bfu of the on-bfu-20061016 archive, this is what it looks like to mount a DVD backup of my photographs burnt from MS Windows, and a memory stick:

root@bangkok> uname -a
SunOS bangkok.priv 5.11 opensol-20061016 i86pc i386 i86pc
root@bangkok> df -h -F hsfs
Filesystem size used avail capacity Mounted on
/dev/dsk/c1t0d0s2 2.7G 2.7G 0K 100% /media/Photos10042006
root@bangkok> df -h -F pcfs
Filesystem size used avail capacity Mounted on
/dev/dsk/c2t0d0p0:1 123M 173K 123M 1% /media/MEM128

As you can see, there is no more /vol, /cdrom or /rmdisk, just the real device name and the /media mountpoint. Also, all the filenames on the DVD are human readable.

Saturday, October 07, 2006

INQ thinks Quad cores too many cores for games.

After speaking to "many developers including some big names", the Inquirer said that you should forget quad core CPUs for games!

What a load of rubbish. Plenty of games impress me enough that whoever built them is obviously a gifted programmer. The following narrow-minded statement, if it did come from real games programmers, does make me wonder:
"You can keep one core busy with the physics and collision detection, second core will have to wait for the score to move on with the Artificial intelligence while the third core could possible calculate the graphic data. In this best case scenario you have to realise that the core number two and three would always have to wait for the core number one to finish its job and pass the job to the cores two and three. In this concept there is absolutely no place for quad core as games are non parallel applications"

Most games I play, I am not the only one playing. If it is not another human or humans, then you play against several computer 'AI' characters. The whole idea that each section of the game is bound to a particular core just stinks of a bad programming model. Maybe they have never heard of threading (or are just scared of it). To say that "games are non parallel applications" is just grossly incorrect.

Let's take an old favorite like Galaga. Very simple... There is my spaceship at the bottom of the screen. It can fire several missiles at about 20 enemy spaceships. Now if you treat my spaceship, the missiles, and the enemy spaceships as different objects and assign at least one thread to each, you end up with a very parallel application, where the more cores the merrier.

Is it a case of gaming programmers still using 1980s programming tools/models, or is it just another bad INQ article? I am sure it is the latter.

Wednesday, September 20, 2006

It's a Coup

Last night I sat through my first coup. All the action was only 5 km away and it just seemed like another Tuesday night. The first I knew about it was when I tried to access some of the online Thai newspapers and they responded after a long wait with lots of PHP/MySQL errors. A quick flick through the television channels showed the same thing on every one: pre-recorded music and images of the King.

Luckily, I could still access the CNN and BBC websites to find out what was going on. Through the whole thing there was the sound of 'booms' in the distance. Luckily, nobody was shooting at each other, and it was just the usual Bangkok thunderstorm.

Today, some of the newspapers are saying the coup will not affect tourism. I would not be that optimistic, though if you are coming to Thailand, then still come. It is safe! Tomorrow, things should be back to normal.

What lessons could be learnt from this? Maybe the online newspapers need to upgrade some of their hardware. They didn't cope.... I am sure some T2000's would :)

Wednesday, September 13, 2006

Almost.... But no cookie today.

Recently one of my girlfriend's friends decided to do a computer course, so now she wants to buy a new computer. Cool, let's go shopping. After walking around the many levels of the infamous Pantip Plaza in Bangkok, she decided she wanted an HP. Cool!

My first assumption was that since it is a branded computer, it would come with the usual Microsoft Virusware installed. Looking through the documentation though, I found that it only comes with a "Free DOS Operating System". Wow, times have changed.... but DOS just makes it totally useless. HP, why not at least install a free Linux distribution?

Rather than buying Windows, let's look at the options:
    Solaris Express - That would be cool. I have the DVD...
    Ubuntu - Quite nice for a beginner. I also have the DVD...
    Linux TLE - Very good for her since she really only speaks Thai

So, I ask the question. "What computer course are you doing?" - Windows..... DOH!!!!

JDS has a new fan

When a friend came to visit the other day, she asked if she could use my computer for a while. Hmmmm, no problems. This is a Windows only person, and my laptop only has OpenSolaris, no Microsoft Windows. I won't say a word and see what happens. Ok, so I went and watched a Thai soapy on TV (Contains plenty of hate, violence, blood and guts. Absolutely no sex! - strange logic), while she typed away.

After she had finished she commented, "I really like your computer. It is nice and easy to use". Wow, it looks like the JDS team have done a good job.

Friday, September 01, 2006

Chinese Pirates

According to a report in the Bangkok Post, "Nearly half of all books, films, music CDs and software sold in China are illegally copied". Now I find it very hard to believe that it could be less than 50%. The Chinese are much smarter than that!

I have always had the view that if the sticker price is much higher than the price to reproduce, then you will always have illegal copies being made. For software, I would love for Microsoft (and others) to find a secure way of enforcing their software licenses. This would force people to look at the alternative free Open Source software such as Solaris, BSD, Linux and OpenOffice. Companies like Microsoft using the old "high admission fee" model will one day have to change their ways. They may be able to hold their market share for a little longer in the developed world, but there is a much larger market in the developing world which is much more price sensitive.

Wednesday, August 30, 2006

The missing benchmarks

When purchasing computer hardware, unless you get your hands on the equipment before you purchase, you have to rely on past experience, the experience of others, or benchmarks to see if it has enough capacity for your users' needs.

There is a wealth of information you can use to judge server capacity. There is also plenty of information for storage capacity. One place that is almost always overlooked is network devices. I have seen many people in the past blindly using the vendor's advertised specifications to determine their needs. Just because a switch has Gig ports and the vendor claims the backplane is "HUGE" does not mean you can assume that the actual performance will be "wire" speed.

If you are in the market for network hardware at the moment, before you go too far you should have a sober look at Simon Bullen's blog on Cisco's 3750. I have always had my doubts about Cisco equipment, and about the reasons why people choose it over other vendors. It would be nice if we could have a somewhat neutral benchmark standards body like www.spec.org to produce a set of network equipment benchmarks. My searching so far only finds vendors (or their proxies) competitively benchmarking their opposition, and of course whipping their ass. You would not expect less.

While independent results from people like Simon are fantastic, it would be nice to see network vendors put their crown jewels on the benchmarking chopping block for all to see. The nice thing about this is that if they don't post results for certain equipment, they may be selling you a brumby.

Thursday, August 17, 2006

PR Nightmare

Is there an outbreak of foot-in-mouth disease at IBM lately, or do they just have a drug problem? I know the Solar System is currently being redefined, but what planet are they on?

Their recent comments on Sun's open source credentials do make you wonder just what they are thinking. Due to their work on the Linux kernel, Apache, Eclipse etc, IBM have been able to enjoy the admiration of the open source community, while at the same time keeping their products very closed. Until now it has been very clever marketing. The problem they have now is that some commentators are starting to see through this, and are asking why IBM does not open source its software and hardware like Sun Microsystems has been doing.

The obvious reply to this, to avoid the focus on IBM's closed products, would have been to acknowledge Sun's effort, and focus on how long it takes to change a product the size of Solaris or AIX from multiple closed source licenses to an open source license. Instead they have decided to attack the OpenSolaris and OpenSPARC projects, stating that they are not truly open. This is just total bullshit!

Rather than hosing down the question, giving the commentators little to write about, they have decided to give the press a juicy story. This would be fine if they had a large swag of products based on open source. The fact is that they have not, and the press has now been alerted to a renewed battle between IBM and Sun, with IBM standing on very shaky ground, surrounded by mountains of closed source and hardware.

IBM's current statements are probably an indication that they think Sun's push to open source not only its entire software stack, but also its hardware, is a medium to long term threat to IBM's position. From what they are saying, it looks like at the moment they are scratching around to find a solution. Using verbal attacks on Sun is just plain counterproductive. I think the future may be tough for companies that dabble in open source while at the same time keeping a very closed product line. Hypocrisy does not further their cause.

What can IBM do to get themselves out of this mess? I think a good start would be to avoid opening their mouths, and start work on a timetable for opening up their products. They state that their customers are not interested enough in AIX, but I find this very hard to believe. When Solaris was re-released onto x86 hardware, they stated that there was not enough interest from their customers to release software for the platform. Their stance changed very quickly when they got a tap on the shoulder from some of their largest customers. I think if they are seeing little interest from their customers, then this is more of a worrying sign for the future of AIX. I hope this is not the case.

Sunday, August 13, 2006

Don't be afraid of mdb - cont

I thought I would just quickly show why I did not use any "optimize" flags when I compiled my example code. At the same time I will show the er_src program, which is part of Sun Studio.

First I compile the program with the '-g' flag. This will give er_src access to the source code.
doug@bangkok> cc -g -o ./makecore ./makecore.c
Next run the program and do a pstack. Everything looks fine
doug@bangkok> ./makecore
Memory fault(coredump)
doug@bangkok> /bin/pstack core
core 'core' of 102246: ./makecore
08050775 fruitloop (8060958, 804710c, 80507bb, 8047210, 804712c, 80506ca) + 15
08050794 giveitatry (8047210, 804712c, 80506ca, 1, 8047138, 8047140) + 14
080507bb main (1, 8047138, 8047140) + b
080506ca _start (1, 8047278, 0, 8047283, 8047290, 80472ca) + 7a
OK, let's have a look at what er_src does
doug@bangkok> /opt/SUNWspro/bin/er_src -func ./makecore

Functions sorted in lexicographic order

Load Object:

Address Size Name

0x000005e0 16 @plt
0x000005f0 16 __fpstart
0x000006dc 123 __fsr
0x00000620 16 _exit
0x000007ec 27 _fini
0x00000640 16 _get_exit_frame_monitor
0x000007d0 27 _init
0x00000650 139 _start
0x00000610 16 atexit
0x00000600 16 exit
0x00000630 16 printf
0x00000760 30 fruitloop
0x00000780 38 giveitatry
0x000007b0 29 main

doug@bangkok> /opt/SUNWspro/bin/er_src -disasm fruitloop ./makecore
Annotated disassembly
---------------------------------------
Source file: ./makecore.c
Object file: ./makecore
Load Object: ./makecore
1. #include
2. #include
3.
4. static void
5. fruitloop(){

[ 5] 8050760: pushl %ebp
[ 5] 8050761: movl %esp,%ebp
[ 5] 8050763: subl $4,%esp
6. char *p;
7. p=(char *)NULL;
[ 7] 8050766: movl $0,-4(%ebp)
8. *p='c';
[ 8] 805076d: movl $0x63,%eax
[ 8] 8050772: movl -4(%ebp),%edx
[ 8] 8050775: movb %al,0(%edx)
[ 8] 8050778: jmp .+4 [ 0x805077c ]
[ 8] 805077a: nop
[ 8] 805077b: nop
9. }
[ 9] 805077c: leave
[ 9] 805077d: ret
10.
11. static void
12. giveitatry(){

[12] 8050780: pushl %ebp
[12] 8050781: movl %esp,%ebp
[12] 8050783: subl $4,%esp
13. char *msg="Ahh we made it!\n";
[13] 8050786: leal 0x8060958,%eax
[13] 805078c: movl %eax,-4(%ebp)
14.
15. fruitloop();
[15] 805078f: call fruitloop [ 0x8050760, .-0x2f ]
16. (void)printf(msg);
[16] 8050794: movl -4(%ebp),%eax
[16] 8050797: pushl %eax
[16] 8050798: call printf [ 0x8050630, .-0x168 ]
[16] 805079d: addl $4,%esp
[16] 80507a0: jmp .+4 [ 0x80507a4 ]
[16] 80507a2: nop
[16] 80507a3: nop
17. }
[17] 80507a4: leave
[17] 80507a5: ret
18.
19. int
20. main(int argc, char **argv){

[20] 80507b0: pushl %ebp
[20] 80507b1: movl %esp,%ebp
[20] 80507b3: subl $4,%esp
21. giveitatry();
[21] 80507b6: call giveitatry [ 0x8050780, .-0x36 ]
22. return(0);
[22] 80507bb: movl $0,-4(%ebp)
[22] 80507c2: jmp .+6 [ 0x80507c8 ]
[22] 80507c4: jmp .+4 [ 0x80507c8 ]
[22] 80507c6: nop
[22] 80507c7: nop
23. }
[23] 80507c8: movl -4(%ebp),%eax
[23] 80507cb: leave
[23] 80507cc: ret
Now, that is really nice. The C program is listed with the assembly code for the corresponding C code. It makes a very nice assembly language tutorial. Now, let's do the same, but compile using the "-fast" flag. (-fast is actually a macro for several other flags. It is known to generally give the best optimized code for your system with the least effort.)
doug@bangkok> cc -fast -g -o ./makecore ./makecore.c
doug@bangkok> ./makecore
Memory fault(coredump)
doug@bangkok> /bin/pstack core
core 'core' of 102283: ./makecore
08050850 main (1, 8047278, 0, 8047283, 8047290, 80472ca)
Hmmm, this time it stopped in function main. Let's look at what -fast did to the code.
doug@bangkok> /opt/SUNWspro/bin/er_src -disasm fruitloop ./makecore
Annotated disassembly
---------------------------------------
Source file: ./makecore.c
Object file: ./makecore
Load Object: ./makecore
1. #include
2. #include
3.
4. static void
5. fruitloop(){
6. char *p;
7. p=(char *)NULL;
8. *p='c';

[ 8] 8050820: movb $0x63,0
9. }
[ 9] 8050827: ret

[ 8] 8050830: movb $0x63,0
10.
11. static void
12. giveitatry(){
13. char *msg="Ahh we made it!\n";
14.

Function fruitloop inlined from source file ./makecore.c into the code for the following line. 0 loops inlined
15. fruitloop();
16. (void)printf(msg);
[16] 8050837: subl $8,%esp
[16] 805083a: pushl $0x80609f4
[16] 805083f: call printf [ 0x8050630, .-0x20f ]
[16] 8050844: addl $0xc,%esp
17. }
[17] 8050847: ret

[ 8] 8050850: movb $0x63,0
[16] 8050857: subl $8,%esp
[16] 805085a: pushl $0x80609f4
[16] 805085f: call printf [ 0x8050630, .-0x22f ]
[16] 8050864: addl $0xc,%esp
18.
19. int
20. main(int argc, char **argv){

Function giveitatry inlined from source file ./makecore.c into the code for the following line. 0 loops inlined
Function fruitloop inlined from source file ./makecore.c into inline copy of function giveitatry. 0 loops inlined
21. giveitatry();
22. return(0);
[22] 8050867: xorl %eax,%eax
[22] 8050869: ret
23. }
As you can read from the comments, both of the functions were inlined. Therefore they are now part of the 'main' function. The 'er_src' program is a really neat app. Let's see the comment change when we tell it not to inline.
doug@bangkok> cc -fast -g -xinline=no%fruitloop -o ./makecore ./makecore.c
doug@bangkok> /opt/SUNWspro/bin/er_src -disasm fruitloop ./makecore
Annotated disassembly
---------------------------------------
Source file: ./makecore.c
Object file: ./makecore
Load Object: ./makecore
.
.
5. fruitloop(){
6. char *p;
7. p=(char *)NULL;
8. *p='c';

[ 8] 8050820: movb $0x63,0
9. }
[ 9] 8050827: ret
10.
11. static void
12. giveitatry(){
13. char *msg="Ahh we made it!\n";
14.

Function fruitloop not inlined because user explicitly requested that it not be inlined
If you are playing around with optimizing code, then er_src is one tool you should use.

Have Fun!

Don't be afraid of mdb

Many Solaris system admins and developers know that Solaris has some very good debugging tools. Most sysadmins know there is a command called mdb. Sadly, most have either never used it, or were scared off when they scanned through the documentation. While using mdb does require a good knowledge of the Solaris internals, and some assembly language skills, there are times when it is probably the only (or best) tool for the job.

Consider the case where you have an application that your company has been using for a long time. Something has changed on the system, and now it crashes when it is run. Since the person who wrote the application no longer works for your company and nobody knows where the source code is, you have a problem. To make things worse, when you do a pstack on the core file, you find that they have “stripped” the binary of its symbol table to save a few bytes. Your options for doing any useful debugging are now really limited. Enter 'mdb'....

To simulate this I have created a small C program, with a null pointer dereference buried in a couple of functions. I compile the program (not using any optimizations, as the compiler would inline all of the functions because they are very small), and then run the strip command on it. When we run the program we get a not very useful error message, and a core dump. Argggg!

Because the binary was stripped, running pstack on the core file returns an address with “????????” as the function name. Ah, it has now turned into a challenge.
doug@bangkok> /bin/pstack core
core 'core' of 101996: ./makecore
080506d5 ???????? (8060898, 80470fc, 805071b, 804720c, 8047124, 805062a)
080506f4 ???????? (804720c, 8047124, 805062a, 1, 8047130, 8047138)
0805071b main (1, 8047130, 8047138) + b
0805062a _start (1, 8047274, 0, 804727f, 804728c, 80472c6) + 7a
You can get similar output from the “::stack” command within mdb.
doug@bangkok> mdb core
Loading modules: [ libc.so.1 ld.so.1 ]
> ::stack
0x80506d5(8060898, 80470fc, 805071b, 804720c, 8047124, 805062a)
0x80506f4(804720c, 8047124, 805062a, 1, 8047130, 8047138)
main+0xb(1, 8047130, 8047138)
_start+0x7a(1, 8047274, 0, 804727f, 804728c, 80472c6)
Since there is nothing in human readable form, at this point most people would look elsewhere or throw it in the too-hard basket. If you know a little assembly language (32-bit x86 in this case), you should probably continue on. A good starting point would be the assembly listing of the function where it bombs out. The first address “0x80506d5” is the instruction where we bombed out. Disassembling backwards from this address is tedious, especially if this instruction is a long way from the beginning of the function. The address on the next line “0x80506f4” is actually more useful. It is the return address of the function, which is the next instruction after the function call. The call instruction itself should be immediately before this. Let's attack it with the disassembler built into 'mdb', byte by byte.
> 80506f4::dis
0x80506f4: movl -0x4(%ebp),%eax
> 80506f3::dis
0x80506f3: decl 0xe850fc45(%ebx)
> 80506f2::dis
0x80506f2: ***ERROR--unknown op code***
> 80506f1::dis
0x80506f1: ***ERROR--unknown op code***
> 80506f0::dis
0x80506f0: int $0x3
> 80506ef::dis
0x80506ef: call -0x34 <0x80506c0>
Bingo! We have a winner - 0x80506c0. You will probably notice that the "call" op-code (1 byte) was followed by a 4 byte relative address, so we could have first tried the address - 5. In my case the command inside of mdb would have been “80506f4-5::dis”.
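
The same arithmetic can be checked outside of mdb. This is only a sketch using the addresses from the session above, and it assumes the call is a direct near call (5 bytes in total); an indirect call has a different length, and the byte by byte approach is still needed for those. The function names call_site and call_target are my own labels.

```shell
#!/bin/sh
# Address arithmetic for a 5-byte x86 near call
# (1 opcode byte + 4-byte relative displacement).
# The values are the ones from the mdb session above.
RET=0x80506f4      # return address taken from the stack trace
DISP=-0x34         # relative displacement shown by ::dis

# The call instruction starts 5 bytes before the return address
call_site() {
    printf '%x\n' $(( $1 - 5 ))
}

# In this session the displacement is relative to the instruction
# after the call, which is the return address itself
call_target() {
    printf '%x\n' $(( $1 + $2 ))
}

call_site ${RET}             # prints 80506ef
call_target ${RET} ${DISP}   # prints 80506c0
```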

Now that we have an address, we can easily list the function from the start.
> 80506c0::dis
0x80506c0: pushl %ebp
0x80506c1: movl %esp,%ebp
0x80506c3: subl $0x4,%esp
0x80506c6: movl $0x0,-0x4(%ebp)
0x80506cd: movl $0x63,%eax
0x80506d2: movl -0x4(%ebp),%edx
0x80506d5: movb %al,0x0(%edx)
0x80506d8: leave
0x80506d9: ret
0x80506da: nop
We can then add the function to mdb's user-defined symbol table, so we now see symbolic names rather than hex addresses. The rough comments were added by me :)
> 80506c0::nmadd -f -e 80506da badfunc
added badfunc, value=80506c0 size=1a
> badfunc::dis
badfunc: pushl %ebp ; save frame pointer to the stack
badfunc+1: movl %esp,%ebp ; copy stack pointer to frame pointer
badfunc+3: subl $0x4,%esp ; make room for the pointer - char *p
badfunc+6: movl $0x0,-0x4(%ebp) ; initialize pointer to null - p=(char*)NULL;
badfunc+0xd: movl $0x63,%eax ; copy 'c' to %eax register
badfunc+0x12: movl -0x4(%ebp),%edx ; copy pointer to register %edx - now = 0
badfunc+0x15: movb %al,0x0(%edx) ; *p = 'c' - Hmmm copy 'c' to address 0 - BAD!!!
badfunc+0x18: leave ; cleanup function call
badfunc+0x19: ret ; return to calling function
> ::stack
badfunc+0x15(8060898, 80470fc, 805071b, 804720c, 8047124, 805062a)
0x80506f4(804720c, 8047124, 805062a, 1, 8047130, 8047138)
main+0xb(1, 8047130, 8047138)
_start+0x7a(1, 8047274, 0, 804727f, 804728c, 80472c6)
> 80506f4-5::dis
0x80506ef: call -0x34
From a quick look at my disassembled code, it is clear that some idiot created a null pointer, and then tried to copy a byte to it. Not very bright eh! In a real world example you would probably need to run the command in mdb and set a breakpoint at the start of the function. From there you could step through the code to see what it does. It would go something like this -
doug@bangkok> mdb ./makecore
> 80506c0::nmadd -f -e 80506da badfunc ; Add our own symbol from above
added badfunc, value=80506c0 size=1a
> badfunc:b ; Set a breakpoint at the beginning of badfunc
> :r ; run ./makecore in the debugger
mdb: stop at badfunc
mdb: target stopped at:
badfunc: pushl %ebp
> :s ; Step through code, 1 step at a time
mdb: target stopped at:
badfunc+1: movl %esp,%ebp
> :s
mdb: target stopped at:
badfunc+3: subl $0x4,%esp
> :s
mdb: target stopped at:
badfunc+6: movl $0x0,-0x4(%ebp)
> :s
mdb: target stopped at:
badfunc+0xd: movl $0x63,%eax
> ::regs ; Check the registers - Hmmm. %edx = 0
%cs = 0x003b %eax = 0x08060898
%ds = 0x0043 %ebx = 0xfeffa7c0
%ss = 0x0043 %ecx = 0xfefa9768 libc.so.1`_sse_hw
%es = 0x0043 %edx = 0x00000000
%fs = 0x0000 %esi = 0x080470e0
%gs = 0x01c3 %edi = 0x08047204

%eip = 0x080506cd badfunc+0xd
%ebp = 0x080470e4
%kesp = 0x00000000

%eflags = 0x00000202
id=0 vip=0 vif=0 ac=0 vm=0 rf=0 nt=0 iopl=0x0
status=

%esp = 0x080470e0
%trapno = 0x1
%err = 0x0
For the best reference on the guts of Solaris and how to make the best use of mdb and other Solaris tools such as DTrace, go and purchase the just released 2nd edition of Solaris Internals and its new companion Solaris Performance and Tools. You can save 30% by buying them through Sun. While you are stacking your bookshelf, you should also consider Solaris System Programming, and Sun Performance and Tuning. Some light reading :)

Thursday, August 10, 2006

Filesystem Benchmarks

If you have been reading zfs-discuss on OpenSolaris.org recently, you will have read that Robert Milkowski has been doing some benchmarks using Sun's StorageTek 3510 FC disk arrays. He has been getting some interesting results that suggest that using ZFS and the 3510 without the hardware RAID controllers is faster than using it with them. This is very interesting because hardware RAID controllers can be expensive. If it suits your needs, you can save some cash by using Solaris 10 and ZFS, as both are free!

Since I don't have a 3510 sitting around to test on, I decided to do a quick benchmark on a spare partition of my laptop to compare ZFS and UFS. We have all been told that ZFS is faster than UFS, but by how much and when is an interesting question.

Using filebench as Robert did, I have started with the varmail workload, using the average of three runs to produce the graph below. For each test I created the pool (for ZFS) and filesystem, did the three benchmark runs for 60 seconds each, and then destroyed the filesystem before the next benchmark test.

For ZFS this was

root> zpool create -fm /none benchpool /dev/dsk/c0d0s4
root> zfs create benchpool/mnt
root> zfs set mountpoint=/mnt benchpool/mnt
root> # Set zfs options here, e.g. zfs set atime=off benchpool/mnt
root> /opt/filebench/bin/filebench
filebench> load varmail
filebench> set $dir=/mnt/zfstest
filebench> run 60
root> zfs destroy benchpool/mnt
root> zpool destroy benchpool

For UFS -

root> newfs /dev/dsk/c0d0s4
root> mount -o noatime -F ufs /dev/dsk/c0d0s4 /mnt
root> # -o noatime is the option for this test
root> /opt/filebench/bin/filebench
filebench> load varmail
filebench> set $dir=/mnt/zfstest
filebench> run 60
root> umount /mnt

As you can see ZFS is indeed faster for this benchmark than UFS. To be fair, and to compare apples to apples, I should have combined UFS with the Solaris Volume Manager (SVM). This would most likely have shown a greater gap between ZFS and UFS. One thing it does show is that an Acer Ferrari 4005 may be a nice laptop, but it makes a horrible mailserver :(
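
The "average of three runs" step is easy to script. The following is only a sketch: the function name average is my own, and the three numbers passed to it are placeholders to show the call, not results from these tests. You would substitute the throughput figure filebench reports for each run.

```shell
#!/bin/sh
# Average the throughput of several benchmark runs.
# The three figures passed in at the bottom are placeholders,
# not measured results.
average() {
    # Print the arithmetic mean of all arguments to one decimal place
    printf '%s\n' "$@" |
        awk '{ sum += $1 } END { if (NR > 0) printf "%.1f\n", sum / NR }'
}

average 100 110 120    # prints 110.0
```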

Friday, July 21, 2006

How small can you make Open Solaris - Part 5

In the latest post I have just done a quick update of the "Quick and Dirty Solaris Installer". Version 0.3 of the installer is now able to exclude Solaris clusters and packages from the installation. With this we are now able to start reducing the size and try to get close to what was achieved in the first couple of posts.

If you read through the code you can see that I have started with the SUNWCmreq metacluster, as it is the smallest defined on the Solaris installation media. I actually had to add the package SUNWmdr, as devfsadmd started complaining during the first reboot. In the code you will see a big list of packages defined in the "exclude array". Since this list includes device drivers, you may have to add and remove driver packages for your system. To do this you will need to read through the '.clustertoc' file in the Product directory of your installation media.
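
To save some scrolling through '.clustertoc', a small filter can list the direct members of one cluster. This is a sketch only. The sample file it writes is invented for illustration, and the layout it assumes (KEY=VALUE lines, a CLUSTER= or METACLUSTER= line opening each entry, SUNW_CSRMEMBER= lines for members, END closing the entry) is inferred from what the installer script parses, so check it against your own media. The function name cluster_members is hypothetical.

```shell
#!/bin/sh
# List the direct members of one cluster in a .clustertoc-style file.
# The sample file below is made up for illustration; the key names
# CLUSTER and METACLUSTER are an assumption about the file layout.
cluster_members() {
    # $1 = cluster name, $2 = path to the toc file
    awk -F= -v want="$1" '
        $2 == want && ($1 == "CLUSTER" || $1 == "METACLUSTER") { inside = 1; next }
        inside && $1 == "END"            { exit }
        inside && $1 == "SUNW_CSRMEMBER" { print $2 }
    ' "$2"
}

TOC=/tmp/sample_clustertoc.$$
cat > "${TOC}" <<'EOF'
CLUSTER=SUNWCexample
NAME=Example cluster
SUNW_CSRMEMBER=SUNWfoo
SUNW_CSRMEMBER=SUNWbar
END
CLUSTER=SUNWCother
SUNW_CSRMEMBER=SUNWbaz
END
EOF

cluster_members SUNWCexample "${TOC}"
rm -f "${TOC}"
```

Against real media the second argument would be the '.clustertoc' file in the Product directory. On older Solaris releases you may need nawk rather than awk for the -v option.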

Using the configuration in this script, I was able to get a bootable multi-user system which only needed /etc/nodename and /etc/hosts configured. The disk usage is down to around 149MB (still a long way to go). The Solaris core packages actually have a lot of files and commands which I would not call 'core'. Later on I will look at creating new packages, and pulling files directly from an OpenSolaris bfu archive. This may also require some files being pulled from a "running" Solaris, as the bfu archives are not complete. If we can successfully use a bfu archive, then we could later see if changing some of the compiler flags in an OpenSolaris build will shrink the binary size.

In the next post, I will start reducing the installation further, by adding commands to the local_install.bash script which can be run just before the filesystem is un-mounted.


#!/bin/bash
#
# Quick and Dirty Solaris installer
# Version 0.3
#
PROD=/cdrom/sol_11_x86/Solaris_11/Product
#PROD=/a/Solaris_11/Product
TOK=${PROD}/.clustertoc
ORDER=${PROD}/.order

# Current metaclusters are
# SUNWCXall - Entire Distribution plus OEM support
# SUNWCall - Entire Distribution
# SUNWCprog - Developer System Support
# SUNWCuser - End User System Support
# SUNWCreq - Core System Support
# SUNWCrnet - Reduced Networking Core System Support
# SUNWCmreq - Minimal Core System Support
METACLUSTER=SUNWCmreq
export METACLUSTER

SWAP=/dev/dsk/c0d0s1
LOG=/tmp/install.log
SVCPROFILE=generic_limited_net.xml

#
# Change these to reflect your system.
#
FS=zfs
[ "${FS}" != "zfs" -a "${FS}" != "ufs" ] && {
printf "Unknown filesystem %s\n" "${FS}"
exit 1
}

[ "${FS}" = "zfs" ] && {
ROOTDEV=intdisk/snv43_zfs
RAWROOTDEV=-
GRUBFS=/dev/dsk/c0d0s0
ZFSBOOTARCHIVE=/grub/boot/boot_zfs
ZFSCOMPRESS=off
# ZFSCOMPRESS=on
MOB=yes
}

[ "${FS}" = "ufs" ] && {
ROOTDEV=/dev/dsk/c0d0s3
RAWROOTDEV=${ROOTDEV/dsk/rdsk}
MOB=no
}

typeset -a pkgs
# extrapkgs is an array of additional packages to be installed
typeset -a extrapkgs=( SUNWmdr )
# exclude is an array of packages or clusters which should not be installed
typeset -a exclude=( SUNWChbaapi SUNWCfcadb SUNWCfca SUNWCfct
SUNWCfutil SUNWCiscsi SUNWCib SUNWCmpapi SUNWCtavor
CADP160 HPFC SK98sol SUNWaac SUNWadp SUNWadpu320 SUNWamr SUNWcadp SUNWced
SUNWchxge SUNWcqhpc SUNWlsimega SUNWmv88sx SUNWnge SUNWrge SUNWrmodr
SUNWrmodu SUNWrtls SUNWses SUNWsi3124 SUNWuksp SUNWuedg SUNWukspfw
SUNWuprl SUNWxge SYMhisl SKfp SUNWintgige SUNWwbsup
SUNWCpkgcmds SUNWjss SUNWidnl SUNWbzip
SUNWperl584core SUNWperl584usr )

#
# check_install tests whether $1 is in the list of packages to be excluded
#
function check_install() {
local i
[ "$#" != "1" ] && return 0

for i in ${exclude[@]} ; do
[ ${i} = "$1" ] && return 1
done
return 0
}

#
# add_extra_pkgs will add in the extrapkgs array to the pkgs array. It will
# avoid adding a package twice
#
function add_extra_pkgs() {
pkgcnt=${#pkgs[@]}
for i in ${extrapkgs[@]} ; do
found=0
for j in ${pkgs[@]} ; do
[ ${i} = ${j} ] && {
found=1
break;
}
done
# Add package if not already found
[ "${found}" = "0" ] && {
printf "Adding Package - %s [%d]\n" "${i}" ${pkgcnt}
pkgs[$(( pkgcnt++ ))]="$i"
}
done
}

#
# Solaris packages need to be installed in the correct order.
# The .order file contains all the packages in the correct
# installation order
#
function reorder_pkgs() {
typeset -a pkglist=( ${pkgs[@]} )
pkgcnt=0

while read order_pkg ; do
for i in ${pkglist[@]} ; do
[ "$order_pkg" = "${i%.i}" ] && {
pkgs[$(( pkgcnt++ ))]="${i}"
printf "."
}
done
done < ${ORDER}
}

#
# This function builds a list of packages in a cluster
# If there is a cluster within a cluster, it will call itself to
# resolve all the packages.
#
# Before calling make sure you initialize pkgcnt to 0
# Arg: $1 contains the cluster name
# Affected vars: pkgs, pkgcnt
#
function get_pkg_list() {
local IFS="="
local print_on=0
local cluster=$1

while read arg1 arg2
do
[ "${arg1}" = "END" -a "${print_on}" = "1" ] && break;
[ -z "${arg2}" ] && continue;

[ "${arg2}" = "${cluster}" ] && {
print_on=1
continue
}

[ "${print_on}" = "1" -a "${arg1}" = "SUNW_CSRMEMBER" ] && {
# Test to see if package/cluster is on the exclude list
check_install "${arg2}" || continue

ifcluster=`expr "${arg2}" : '\(SUNWC\)'`
if [ "${ifcluster}" = "SUNWC" ]; then
get_pkg_list ${arg2}
else
[ -d ${PROD}/${arg2} ] && {
pkgs[$(( pkgcnt++ ))]="${arg2}"
printf "."
continue;
}

arg2="${arg2}.i"
[ -d ${PROD}/${arg2} ] && pkgs[$(( pkgcnt++ ))]="${arg2}"
printf "."
fi
}
done < ${TOK}
}

#
# Check for the installation image before proceeding
#
[ ! -d ${PROD} ] && {
echo "Cannot find Solaris Installation"
exit 1
}

#
# Create a pkg admin file - see man admin(4)
#
sed 's/ask/nocheck/' /var/sadm/install/admin/default > /tmp/.admin.doit

#
# Build an ordered list of packages from the Solaris installation image
#
printf "Building a list of packages "
pkgcnt=0
get_pkg_list ${METACLUSTER}
echo
# Add extra packages before sorting
add_extra_pkgs
echo
printf "Sorting packages into the correct order for installation "
reorder_pkgs
echo
#
# Try to create the filesystem you define at the beginning
#
case ${FS} in
"zfs") zfs create ${ROOTDEV} || {
echo "Cannot create zfs filesystem!"
exit 1
}
zfs set mountpoint=legacy ${ROOTDEV}
zfs set compression=${ZFSCOMPRESS} ${ROOTDEV}
;;
"ufs") newfs ${ROOTDEV} || {
echo "Cannot create ufs filesystem!"
exit 1
} ;;
*) echo "Unknown filesystem ${FS}"
exit 1
;;
esac

mount -F ${FS} ${ROOTDEV} /mnt || {
echo "Cannot mount filesystem!"
exit 1
}

#
# Install packages from Solaris installation image
#
echo "Starting installation of packages"
echo
(
for i in ${pkgs[@]} ; do
pkgadd -n -a /tmp/.admin.doit -d ${PROD} -R /mnt $i
done
) > ${LOG}

#
# Update /etc/vfstab with swap and root partitions
#
(
printf "${SWAP}\t-\t-\tswap\t-\tno\t-\n"
printf "${ROOTDEV}\t${RAWROOTDEV}\t/\t${FS}\t1\t${MOB}\t-\n"
[ "$FS" = "zfs" ] && {
mkdir -m 0755 /mnt/grub
printf "${GRUBFS}\t${GRUBFS/dsk/rdsk}\t/grub\tufs\t3\tyes\t-\n"
}
) >> /mnt/etc/vfstab

#
# Copy links for disk partitions in /dev/dsk and /dev/rdsk
# This is needed so the system can find the root partition on boot
#
( cd /dev && find dsk rdsk -depth | cpio -pdm /mnt/dev 2>/dev/null )

#
# Configure system to initialize identity on first boot
# If there is a sysidcfg file in the current directory, it will
# be copied across.
#
PROFILEDIR=/mnt/var/svc/profile
[ -f ${PROFILEDIR}/${SVCPROFILE} ] && {
if [ -f ./sysidcfg ]; then
cp ./sysidcfg /mnt/etc
else
touch /mnt/etc/.UNCONFIGURED
fi
cp -p ${PROFILEDIR}/${SVCPROFILE} ${PROFILEDIR}/generic.xml
}

#
# set bootpath to root filesystem.
# Also set the console to text
#
(
[ "${FS}" = "ufs" ] && {
BOOTPATH=$( ls -l ${ROOTDEV} | nawk '{print $11}' |
sed -e 's#[./]*/devices/#/#' )

printf "setprop bootpath ${BOOTPATH}\n"
}
printf "setprop console 'text'\n"
) >> /mnt/boot/solaris/bootenv.rc

#
# If found execute local script before /mnt is unmounted
#
[ -x ./local_install.bash ] && ./local_install.bash

#
# Finish off installation
#
[ -f /etc/zfs/zpool.cache ] && {
cp -p /etc/zfs/zpool.cache /mnt/etc/zfs
echo "etc/zfs/zpool.cache" >> /mnt/boot/solaris/filelist.ramdisk
}

#
# Configure for a ZFS boot. At the moment you need a small UFS partition
# somewhere for grub.
#
[ "${FS}" = "zfs" ] && {
(
printf "rootfs:zfs\n"
printf "zfsroot:${ROOTDEV}\n"
) >> /mnt/etc/system
}

devfsadm -r /mnt
rm -f /mnt/reconfigure
bootadm update-archive -R /mnt

[ "${FS}" = "zfs" ] && {
cp /mnt/platform/i86pc/boot_archive ${ZFSBOOTARCHIVE}
cp -p /mnt/sbin/bootadm /mnt/sbin/bootadm.real
cat >/mnt/sbin/bootadm << EOM
#!/usr/bin/sh

/sbin/bootadm.real "\$@"
if [ "\$1" = "update-archive" -a -d /grub/boot/grub ]; then
/usr/bin/cp /platform/i86pc/boot_archive ${ZFSBOOTARCHIVE}
fi
exit 0
EOM
}

echo
echo "If you have not already, you will need to configure menu.lst to"
echo "boot this partition."

umount /mnt
# eject cdrom
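
The exclusion test used by the installer above can be tried on its own, away from a real install. What follows is a minimal sketch in plain sh rather than the bash arrays the script uses; filter_excluded is a made-up name, and the package names passed to it are only sample arguments.

```shell
#!/bin/sh
# filter_excluded "EXCLUDE LIST" PKG...
# Hypothetical helper: prints every PKG that is NOT on the space-separated
# exclude list, one per line. It mirrors the idea behind check_install.
filter_excluded() {
    exclude_list=$1; shift
    for pkg in "$@"; do
        skip=0
        for e in $exclude_list; do
            [ "$e" = "$pkg" ] && skip=1 && break
        done
        [ "$skip" = "0" ] && echo "$pkg"
    done
    return 0
}

# Sample run: SUNWbzip is on the exclude list, so it is filtered out.
filter_excluded "SUNWjss SUNWbzip" SUNWcsr SUNWbzip SUNWcar
```

Running a candidate list through a filter like this before touching pkgadd is a cheap way to check what an exclude list would really remove.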

Wednesday, July 19, 2006

How small can you make Open Solaris - Part 4

Below, I have posted the new version of the Quick and Dirty Solaris Installer. This version will install Open Solaris directly onto a ZFS filesystem. It took me a little longer than expected, as I was having problems with the create_ramdisk.ksh script which is used by bootadm. The script runs 'du' on every file it copies to the ramdisk to calculate the space required. If 'du' is used on a compressed file, it returns the compressed size. The trouble starts because the boot_archive image is built on a UFS filesystem, where nothing is compressed: files copied from a compressed ZFS volume expand to their full size and soon fill the ramdisk.

Not being able to create a boot archive is not good. The current workaround is to turn compression off for the root filesystem. In the next update, I will modify the script to start excluding packages, so we can see what is needed and what is not.
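
To see the mismatch for yourself, compare what 'du' reports with the real length of the files. The sketch below is not taken from create_ramdisk.ksh; apparent_kb is a hypothetical helper that adds up the byte lengths reported by 'wc -c', which is the uncompressed size whatever the filesystem is doing underneath.

```shell
#!/bin/sh
# apparent_kb FILE...
# Hypothetical helper (not part of create_ramdisk.ksh): sums the byte length
# of each regular file and prints the total rounded up to whole kilobytes.
apparent_kb() {
    total=0
    for f in "$@"; do
        [ -f "$f" ] || continue          # silently skip anything that is not a file
        bytes=$(wc -c < "$f")
        total=$(( total + bytes ))
    done
    echo $(( (total + 1023) / 1024 ))
}

# Typical comparison on a compressed ZFS filesystem:
#   du -k /path/to/file            (blocks actually allocated)
#   apparent_kb /path/to/file      (space needed once copied to UFS)
```

On a compressed volume the second figure should come out larger than the first, and it is the second one the ramdisk has to hold.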


#!/bin/bash
#
# Quick and Dirty Solaris installer
# Version 0.2
#
PROD=/cdrom/sol_11_x86/Solaris_11/Product
# PROD=/a/Solaris_11/Product
TOK=${PROD}/.clustertoc
ORDER=${PROD}/.order

# Current metaclusters are
# SUNWCXall - Entire Distribution plus OEM support
# SUNWCall - Entire Distribution
# SUNWCprog - Developer System Support
# SUNWCuser - End User System Support
# SUNWCreq - Core System Support
# SUNWCrnet - Reduced Networking Core System Support
# SUNWCmreq - Minimal Core System Support
METACLUSTER=SUNWCrnet

SWAP=/dev/dsk/c0d0s1
LOG=/tmp/install.log
SVCPROFILE=generic_limited_net.xml

#
# Change these to reflect your system.
#
FS=zfs
[ "${FS}" != "zfs" -a "${FS}" != "ufs" ] && {
printf "Unknown filesystem ${FS}\n"
exit 1
}

[ "${FS}" = "zfs" ] && {
ROOTDEV=intdisk/snv43_zfs
RAWROOTDEV=-
GRUBFS=/dev/dsk/c0d0s0
ZFSBOOTARCHIVE=/grub/boot/boot_zfs
ZFSCOMPRESS=off
# ZFSCOMPRESS=on
MOB=yes
}

[ "${FS}" = "ufs" ] && {
ROOTDEV=/dev/dsk/c0d0s3
RAWROOTDEV=${ROOTDEV/dsk/rdsk}
MOB=no
}

typeset -a pkgs

#
# Solaris packages need to be installed in the correct order.
# The .order file contains all the packages in the correct
# installation order
#
function reorder_pkgs() {
typeset -a pkglist=( ${pkgs[@]} )
pkgcnt=0

while read order_pkg ; do
for i in ${pkglist[@]} ; do
[ "$order_pkg" = "${i%.i}" ] && {
pkgs[$(( pkgcnt++ ))]="${i}"
printf "."
}
done
done < ${ORDER}
}

#
# This function builds a list of packages in a cluster
# If there is a cluster within a cluster, it will call itself to
# resolve all the packages.
#
# Before calling make sure you initialize pkgcnt to 0
# Arg: $1 contains the cluster name
# Affected vars: pkgs, pkgcnt
#
function get_pkg_list() {
local IFS="="
local print_on=0
local cluster=$1

while read arg1 arg2
do
[ "${arg1}" = "END" -a "${print_on}" = "1" ] && break;
[ -z "${arg2}" ] && continue;

[ "${arg2}" = "${cluster}" ] && {
print_on=1
continue
}

[ "${print_on}" = "1" -a "${arg1}" = "SUNW_CSRMEMBER" ] && {
ifcluster=`expr "${arg2}" : '\(SUNWC\)'`
if [ "${ifcluster}" = "SUNWC" ]; then
get_pkg_list ${arg2}
else
[ -d ${PROD}/${arg2} ] && {
pkgs[$(( pkgcnt++ ))]="${arg2}"
printf "."
continue;
}

arg2="${arg2}.i"
[ -d ${PROD}/${arg2} ] && pkgs[$(( pkgcnt++ ))]="${arg2}"
printf "."
fi
}
done < ${TOK}
}

#
# Check for the installation image before proceeding
#
[ ! -d ${PROD} ] && {
echo "Cannot find Solaris Installation"
exit 1
}

#
# Create a pkg admin file - see man admin(4)
#
sed 's/ask/nocheck/' /var/sadm/install/admin/default > /tmp/.admin.doit

#
# Build an ordered list of packages from the Solaris installation image
#
printf "Building a list of packages "
pkgcnt=0
get_pkg_list ${METACLUSTER}
echo
printf "Sorting packages into the correct order for installation "
reorder_pkgs
echo

#
# Try to create the filesystem you define at the beginning
#
case ${FS} in
"zfs") zfs create ${ROOTDEV} || {
echo "Cannot create zfs filesystem!"
exit 1
}
zfs set mountpoint=legacy ${ROOTDEV}
zfs set compression=${ZFSCOMPRESS} ${ROOTDEV}
;;
"ufs") newfs ${ROOTDEV} || {
echo "Cannot create ufs filesystem!"
exit 1
} ;;
*) echo "Unknown filesystem ${FS}"
exit 1
;;
esac

mount -F ${FS} ${ROOTDEV} /mnt || {
echo "Cannot mount filesystem!"
exit 1
}

#
# Install packages from Solaris installation image
#
echo "Starting installation of packages"
echo
(
for i in ${pkgs[@]} ; do
pkgadd -n -a /tmp/.admin.doit -d ${PROD} -R /mnt $i
done
) > ${LOG}

#
# Update /etc/vfstab with swap and root partitions
#
(
printf "${SWAP}\t-\t-\tswap\t-\tno\t-\n"
printf "${ROOTDEV}\t${RAWROOTDEV}\t/\t${FS}\t1\t${MOB}\t-\n"
[ "$FS" = "zfs" ] && {
mkdir -m 0755 /mnt/grub
printf "${GRUBFS}\t${GRUBFS/dsk/rdsk}\t/grub\tufs\t3\tyes\t-\n"
}
) >> /mnt/etc/vfstab

#
# Copy links for disk partitions in /dev/dsk and /dev/rdsk
# This is needed so the system can find the root partition on boot
#
( cd /dev && find dsk rdsk -depth | cpio -pdm /mnt/dev 2>/dev/null )

#
# Configure system to initialize identity on first boot
# If there is a sysidcfg file in the current directory, it will
# be copied across.
#
PROFILEDIR=/mnt/var/svc/profile
[ -f ${PROFILEDIR}/${SVCPROFILE} ] && {
if [ -f ./sysidcfg ]; then
cp ./sysidcfg /mnt/etc
else
touch /mnt/etc/.UNCONFIGURED
fi
cp -p ${PROFILEDIR}/${SVCPROFILE} ${PROFILEDIR}/generic.xml
}

#
# set bootpath to root filesystem.
# Also set the console to text
#
(
[ "${FS}" = "ufs" ] && {
BOOTPATH=$( ls -l ${ROOTDEV} | nawk '{print $11}' |
sed -e 's#[./]*/devices/#/#' )

printf "setprop bootpath ${BOOTPATH}\n"
}
printf "setprop console 'text'\n"
) >> /mnt/boot/solaris/bootenv.rc

#
# If found execute local script before /mnt is unmounted
#
[ -x ./local_install.bash ] && ./local_install.bash

#
# Finish off installation
#
[ -f /etc/zfs/zpool.cache ] && {
cp -p /etc/zfs/zpool.cache /mnt/etc/zfs
echo "etc/zfs/zpool.cache" >> /mnt/boot/solaris/filelist.ramdisk
}

#
# Configure for a ZFS boot. At the moment you need a small UFS partition
# somewhere for grub.
#
[ "${FS}" = "zfs" ] && {
(
printf "rootfs:zfs\n"
printf "zfsroot:${ROOTDEV}\n"
) >> /mnt/etc/system
}

devfsadm -r /mnt
rm -f /mnt/reconfigure
bootadm update-archive -R /mnt

[ "${FS}" = "zfs" ] && {
cp /mnt/platform/i86pc/boot_archive ${ZFSBOOTARCHIVE}
cp -p /mnt/sbin/bootadm /mnt/sbin/bootadm.real
cat >/mnt/sbin/bootadm << EOM
#!/usr/bin/sh

/sbin/bootadm.real "\$@"
if [ "\$1" = "update-archive" -a -d /grub/boot/grub ]; then
/usr/bin/cp /platform/i86pc/boot_archive ${ZFSBOOTARCHIVE}
fi
exit 0
EOM
}

echo
echo "If you have not already, you will need to configure menu.lst to"
echo "boot this partition."

umount /mnt
# eject cdrom
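
The reordering step in the installer above is easy to try in isolation. This is a minimal sketch in plain sh, assuming an order file with one package name per line; order_by is a made-up name, and unlike the script it prints the result instead of rewriting an array.

```shell
#!/bin/sh
# order_by ORDERFILE PKG...
# Hypothetical helper: prints the given packages in the order their names
# appear in ORDERFILE. A trailing ".i" on a package name is ignored when
# matching, as in reorder_pkgs. Packages missing from ORDERFILE are dropped.
order_by() {
    order_file=$1; shift
    while read -r name; do
        for p in "$@"; do
            [ "$name" = "${p%.i}" ] && echo "$p"
        done
    done < "$order_file"
    return 0
}

# Example (the path is the ORDER variable from the script):
#   order_by "${PROD}/.order" SUNWcsr SUNWcar.i
```

One thing the sketch makes obvious: a package that does not appear in the order file never comes out the other end, so it is worth comparing the counts before and after sorting.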

Sunday, July 16, 2006

How small can you make Open Solaris - Part 3

In this post I have decided to focus on installing Solaris directly without using the normal Solaris installer. If you are like me and forever installing the latest Solaris Nevada version, and would like to do this without rebooting and running the Solaris installer then the script below might be for you. If you are playing around to see what packages you can remove to shrink a Solaris installation, then a modified version of this script might be what you are looking for.

Solaris already has an excellent tool for upgrading your current system called live upgrade. The script below is different, as it does a totally fresh install. Hopefully the next version will also install directly onto a ZFS filesystem rather than a UFS partition.

To start off you need a Solaris X86 system (one which uses grub) with a free Solaris partition that has enough diskspace to hold the installation. Note, this script does no checking! It assumes you have configured the settings correctly, and have allocated enough space. You also have to have a DVD (or image) of Solaris (or maybe a jumpstart type installation directory).

Going through the script, at the beginning you will find some variables you may need to change to reflect your setup. Since it uses the Solaris metaclusters, you will need to define which one you want. The functions (which make up most of the code) rattle through the cluster table of contents file and produce an ordered list of packages for the metacluster you selected. Once it has the list, a pkgadd is executed for each package.
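
To get a feel for what the functions are reading, here is a minimal sketch that lists the direct members of one cluster. It assumes the table of contents is made of KEY=VALUE lines, with each block opened by a CLUSTER= or METACLUSTER= line and closed by END, which is the layout the script relies on. list_members is a made-up name, and it does not recurse into nested clusters the way get_pkg_list does.

```shell
#!/bin/sh
# list_members TOC CLUSTER
# Hypothetical helper: prints the SUNW_CSRMEMBER values found between the
# line that names CLUSTER and the END line that closes its block.
list_members() {
    toc=$1
    wanted=$2
    in_block=0
    while IFS="=" read -r key value; do
        if [ "$in_block" = "1" ] && [ "$key" = "END" ]; then
            break
        fi
        if [ "$value" = "$wanted" ]; then
            case "$key" in
                CLUSTER|METACLUSTER) in_block=1; continue ;;
            esac
        fi
        if [ "$in_block" = "1" ] && [ "$key" = "SUNW_CSRMEMBER" ]; then
            echo "$value"
        fi
    done < "$toc"
    return 0
}

# Example (the path is the TOK variable from the script):
#   list_members "${PROD}/.clustertoc" SUNWCrnet
```

Any member whose name starts with SUNWC is itself a cluster, which is why the real script calls itself again on those entries.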

After the packages have been installed, there are a few system configurations and device links that are required to get the system booted correctly. You can have the system automatically configure itself on reboot if you have a valid sysidcfg file in your current directory. Otherwise, the system will ask you many questions on reboot to configure hostnames, networking etc. If you are familiar with building zones, this will not be new.

Before you start, you should read through the script so you know exactly what it does. There are very few error/sanity checks in the script, and incorrect settings could be devastating, as you will need to be root (or have the correct privs) to use it. When you have finished the installation you will need to modify grub to boot off the correct partition. Below is the relevant section of my /boot/grub/menu.lst file.


title Solaris Test
root (hd0,0,d)
kernel /platform/i86pc/multiboot
module /platform/i86pc/boot_archive


Enjoy, and Good Luck!!!


#!/bin/bash
#
# Quick and Dirty Solaris installer
# Version 0.1
#
PROD=/cdrom/sol_11_x86/Solaris_11/Product
TOK=${PROD}/.clustertoc
ORDER=${PROD}/.order

# Current metaclusters are
# SUNWCXall - Entire Distribution plus OEM support
# SUNWCall - Entire Distribution
# SUNWCprog - Developer System Support
# SUNWCuser - End User System Support
# SUNWCreq - Core System Support
# SUNWCrnet - Reduced Networking Core System Support
# SUNWCmreq - Minimal Core System Support
METACLUSTER=SUNWCrnet

#
# Change these to reflect your system. The next version should
# use ZFS rather than UFS.
#
ROOTDEV=/dev/dsk/c0d0s4
RAWROOTDEV=/dev/rdsk/c0d0s4

SWAP=/dev/dsk/c0d0s1
LOG=/tmp/install.log

SVCPROFILE=generic_limited_net.xml

typeset -a pkgs

#
# Solaris packages need to be installed in the correct order.
# The .order file contains all the packages in the correct
# installation order
#
function reorder_pkgs() {
typeset -a pkglist=( ${pkgs[@]} )
pkgcnt=0

while read order_pkg ; do
for i in ${pkglist[@]} ; do
[ "$order_pkg" = "${i%.i}" ] && {
pkgs[$(( pkgcnt++ ))]="${i}"
printf "."
}
done
done < ${ORDER}
}

#
# This function builds a list of packages in a cluster
# If there is a cluster within a cluster, it will call itself to
# resolve all the packages.
#
# Before calling make sure you initialize pkgcnt to 0
# Arg: $1 contains the cluster name
# Affected vars: pkgs, pkgcnt
#
function get_pkg_list() {
local IFS="="
local print_on=0
local cluster=$1

while read arg1 arg2
do
[ "${arg1}" = "END" -a "${print_on}" = "1" ] && break;
[ -z "${arg2}" ] && continue;

[ "${arg2}" = "${cluster}" ] && {
print_on=1
continue
}

[ "${print_on}" = "1" -a "${arg1}" = "SUNW_CSRMEMBER" ] && {
ifcluster=`expr "${arg2}" : '\(SUNWC\)'`
if [ "${ifcluster}" = "SUNWC" ]; then
get_pkg_list ${arg2}
else
[ -d ${PROD}/${arg2} ] && {
pkgs[$(( pkgcnt++ ))]="${arg2}"
printf "."
continue;
}

arg2="${arg2}.i"
[ -d ${PROD}/${arg2} ] && pkgs[$(( pkgcnt++ ))]="${arg2}"
printf "."
fi
}
done < ${TOK}
}

#
# Check for the installation image before proceeding
#
[ ! -d ${PROD} ] && {
echo "Cannot find Solaris Installation"
exit 1
}

#
# Create a pkg admin file - see man admin(4)
#
sed 's/ask/nocheck/' /var/sadm/install/admin/default > /tmp/.admin.doit

#
# Build an ordered list of packages from the Solaris installation image
#
printf "Building a list of packages "
pkgcnt=0
get_pkg_list ${METACLUSTER}
echo
printf "Sorting packages into the correct order for installation "
reorder_pkgs
echo

newfs ${ROOTDEV} || exit 1
mount ${ROOTDEV} /mnt

#
# Install packages from Solaris installation image
#
echo "Starting installation of packages"
echo
(
for i in ${pkgs[@]} ; do
pkgadd -n -a /tmp/.admin.doit -d ${PROD} -R /mnt $i
done
) > ${LOG}

#
# Update /etc/vfstab with swap and root partitions
#
(
printf "${SWAP}\t-\t-\tswap\t-\tno\t-\n"
printf "${ROOTDEV}\t${RAWROOTDEV}\t/\tufs\t1\tno\t-\n"
) >> /mnt/etc/vfstab

#
# Copy links for disk partitions in /dev/dsk and /dev/rdsk
# This is needed so the system can find the root partition on boot
#
( cd /dev && find dsk rdsk -depth | cpio -pdm /mnt/dev 2>/dev/null )

#
# Configure system to initialize identity on first boot
# If there is a sysidcfg file in the current directory, it will
# be copied across.
#
PROFILEDIR=/mnt/var/svc/profile
[ -f ${PROFILEDIR}/${SVCPROFILE} ] && {
if [ -f ./sysidcfg ]; then
cp ./sysidcfg /mnt/etc
else
touch /mnt/etc/.UNCONFIGURED
fi
cp -p ${PROFILEDIR}/${SVCPROFILE} ${PROFILEDIR}/generic.xml
}

#
# set bootpath to root filesystem.
# Also set the console to text
#
(
BOOTPATH=$( ls -l ${ROOTDEV} | nawk '{print $11}' |
sed -e 's#[./]*/devices/#/#' )

printf "setprop bootpath ${BOOTPATH}\n"
printf "setprop console 'text'\n"
) >> /mnt/boot/solaris/bootenv.rc

#
# If found execute local script before /mnt is unmounted
#
[ -x ./local_install.bash ] && ./local_install.bash

#
# Finish off installation
#
bootadm update-archive -R /mnt
echo "You will need to configure /boot/grub/menu.lst to boot this partition"

umount /mnt
# eject cdrom

Wednesday, July 12, 2006

ZFS saved my backside

Well, not totally, but it did save me a re-install!!!

Last night I decided to stay up and watch the webcast of the product launch of Sun's new servers (Hmmm, I want a Thumper...). In Thailand the festivities did not start till 12:30 at night. The laptop I am using is an Acer Ferrari 4005. Recently, Sun released a sound driver for it (I was using OSS before), but whenever I used it there was sound + static.

Now I had 1 hour before the webcast, so I decided to do the usual rounds of the Open Solaris site to fill in time. I found that the latest ON build had a patch for the sound driver. Hmmm, there is not enough time for a compile (even on a Ferrari), so I downloaded the bfu archive, and started the install. If you read my earlier articles, you will probably know that I am running on a ZFS root. Since this is a loosely undocumented feature, it does complicate a 'bfu'. I have done a bfu update on a ZFS root once before, so I should not have a problem, "Right?". The download took almost 1 hour (20Kb on a 4Mb link, grrr), so I quickly put the original bootadm back before I 'bfu'd'. After the 'bfu' came the usual 'acr', and then I replaced bootadm with the zfs modified bootadm and updated the archive. Just in time for a reboot right on the bell.

As I was rebooting, I was thinking, "I should have done a zfs snapshot before I started". It was late, and I was fully aware that a failure would only affect me. Ok, reboot. . . . Ahhhhh!!! Almost as I rebooted, the ferrari reset and booted again. Grrrr. I started to get 'Rhymes with MISSED'. I quickly edited grub to add the "-kd" option to the kernel line, and booted. Great :(, the error message was that it could not find the root partition. Ok, failsafe it is then.....

In failsafe, everything looked ok. I could mount the ZFS root. /etc/zfs/zpool.cache existed, and looked ok (Gibberish. It is binary after all). I then decided to look at grub, and found that the file /boot/solaris/filelist.ramdisk did not contain zpool.cache. Ah we now have somebody else to blame!!! The 'bfu/acr' procedure updates this file without considering that I may have added to it.

Right, what to do now. I tried to update the boot archive from both failsafe, and from a spare UFS root, and kept on getting a "filesystem full" error on the ramdisk. Now I have 1.5GB of memory and plenty of swap space, and I had modified the create_ramdisk script to double the amount of memory allocated. I still got the same message. "Marvellous"

This all prompted a late night re-think. "I wish I had done a snapshot before I started the bfu....". Ok it was looking like a re-install of the root partition was on the cards. This was something for after the morning coffee... Hang on, after I installed and created a ZFS root, I did a snapshot to create a clone. That snapshot was still there! "zfs rollback intdisk/snv42_root@initial". Reboot. Hey, we are in business. I still have some driver/app installing and JDS update to do, but no Solaris install. Fantastic. So I quickly brought up firefox, and connected to the webcast, just to hear the last sentence. Oh, well I will fix the rest up in the morning. Now if I had only done a snapshot again before the 'bfu'!!!!

ZFS Rocks!!!
P.S. Taking a ZFS snapshot takes less time than thinking about it.
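
For the record, the habit worth forming looks something like the sketch below. It only prints the commands, so it is safe to run anywhere; the dataset name is the one from the story, while the 'pre-bfu' label and the two function names are made up for illustration.

```shell
#!/bin/sh
# snapshot_cmd DATASET LABEL
# Hypothetical helper: prints the zfs command that would snapshot DATASET.
snapshot_cmd() {
    echo "zfs snapshot $1@$2"
}

# rollback_cmd DATASET LABEL
# Hypothetical helper: prints the matching command to return to that snapshot.
rollback_cmd() {
    echo "zfs rollback $1@$2"
}

# Print what to run before the bfu, and what to run if it all goes wrong.
snapshot_cmd intdisk/snv42_root pre-bfu
rollback_cmd intdisk/snv42_root pre-bfu
```

Keep in mind that a plain 'zfs rollback' only goes back to the most recent snapshot; reaching an older one needs the -r option, which destroys the snapshots in between.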

Saturday, July 08, 2006

How small can you make Open Solaris - Part 2

In the first part of the article I decided to use the failsafe miniroot as a base for a minimal Solaris. I also described how to mount and unmount a copy of the miniroot image which I termed microroot. You should be very comfortable with doing this before tackling this article. I will start with a quick analysis of the miniroot before we jump in and start removing files. It is really important to find out what is required by the operating system before you wield the 'rm' axe.

From playing around in the last article you would have realised that failsafe stops at the single user milestone. So it would be really nice to see what file SMF executes to get to the shell prompt. After you get the shell prompt we should look at what processes are running, and what dynamic libraries are in use. This will give us an idea of what needs to be kept.

Let's start. In normal Solaris, as the root user, create a directory called 'work' in your root partition. This directory will later be used to transfer files between the microroot and normal Solaris. In that directory create a script "report.sh" from the following listing. Other than listing services and processes, the horrible munge of shell scripting goes through the process table listing the shared libraries and objects used by each process. It then sorts them and does a 'unique', to give you a nice listing of the libraries and objects you will need to keep.

mkdir /work
vi /work/report.sh

#!/sbin/sh

echo "Shared libraries in use"
echo "-----------------------"
(
/usr/bin/ps -eo pid | /usr/bin/awk '
$1 != "PID" {
printf "/usr/bin/pmap -x %s\n",$1
}' | /sbin/sh 2>/dev/null
) | /usr/bin/awk '{ print $7 }' | /usr/bin/grep '[.]so' | /usr/bin/sort -u

echo
echo "Services Status"
echo "---------------"
/usr/bin/svcs

echo
echo "Process Table"
echo "-------------"
/usr/bin/ps -eo user,pid,comm


Now we are ready to reboot and run the report. At the grub prompt you can either select the failsafe or the microroot option from part 1, as they should be identical at this point. Instead of pressing ENTER to boot, press 'e' for edit. This should bring up a screen with the entries from your menu.lst file in the grub directory. Using the arrow keys, move down to the line starting with 'kernel' and press 'e' again. At the end of the line add the '-m debug' option. This option will send 'smf' into debug mode, filling your screen with information when you boot. Don't worry, any changes made here are not saved back to the menu.lst file. Press ENTER and then 'b' to boot.

During the boot smf should be printing out a huge amount of information. Don't panic if you cannot speed read; the information you need is at the end. You will see that the last thing smf does is related to the install-discovery.xml file. Later, if you look at this file, you will find it runs the script /sbin/install-discovery. This is a file of great interest and we will definitely attack it.

At this point you are being asked to mount your normal Solaris partition on /a. Answer 'y' for yes. Now run the report and put the results back into your work directory.

cd /
/sbin/sh /a/work/report.sh > /a/work/report.txt

This should not take long to run. When it has finished, reboot into normal Solaris and review the report.

In the report you should see around 36 libraries/objects listed. The process table contains a small number of running processes. They seem perfectly reasonable, so we will leave them alone. Now mount the microroot onto /mnt and run the du command on it. Note: I have truncated the output.

du -ks * .??* | sort -nr
89502 usr
24961 kernel
10857 lib
2230 platform
1317 etc
1215 sbin
1156 boot
227 .tmp_proto

As you may expect, from the output of the du command you will see that /usr, /kernel, /lib and /platform are the biggest disk space users. Since /kernel and /platform are the Solaris kernel we will leave these alone. At the end, if you want a smaller image, there is plenty of scope for removing unused driver modules. As most people would expect, it will be /usr and /lib that first come under the knife. You might find it strange that /tmp is not empty, and that there is a .tmp_proto directory. This is because the Solaris image you are building, unlike Solaris on your system, is not writeable. The microroot will make use of a tmpfs filesystem for files which need to be written to. You can see this by doing a 'ls -l /mnt/etc/vfstab'. If you want to know more, just look through the install-discovery script.

The first time I was cutting sections out, I just went into a directory, had a look, and if I did not think the file or directory was relevant I attacked it with the 'rm' command. Then I rebooted to test the changes. After doing this a few times, I thought it would be better to copy the commands into an editor so I could easily repeat the whole process from scratch if needed.

The next step is to copy /work/report.txt to /work/libs. Edit this file and remove everything except for the libraries and shared objects. You will notice that some libraries end with '.so' and others end with '.so.1' etc. This is the version number of the shared library. With the editor, remove the version from the end of the lines, so that all the lines end with '.so'. We will use the file later with fgrep to identify the libraries and links we want to keep.
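
If you would rather not do this edit by hand, the same trimming can be sketched with sed. This is only an illustration of the idea, assuming each library name sits on a line of its own in the report; strip_so_versions is a made-up name.

```shell
#!/bin/sh
# strip_so_versions
# Hypothetical helper: reads report lines on stdin, keeps only the lines that
# end in ".so" or ".so.<version>", cuts the version off, and prints each
# resulting name once.
strip_so_versions() {
    sed -n 's/\(\.so\)[.0-9]*$/\1/p' | LC_ALL=C sort -u
}

# Example:
#   strip_so_versions < /work/report.txt > /work/libs
```

Lines such as the section titles and the process table do not end in '.so' plus a version, so the sed expression simply skips them. It is still worth reading the result before trusting it with 'rm'.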

It is now time to get to the fun part. Choose your weapon, 'as we're going Hunting', er sorry - 'deleting'. I have listed the commands so you can place them into a script at your leisure.

Let's start off with a no-brainer and remove some obvious directories. No real space is gained yet but it may make you feel better...

cd /mnt
rm -rf /mnt/boot /mnt/cdrom /mnt/opt
Remove all locales except for 'C' from /usr/lib/locale.
cd /mnt/usr/lib/locale && rm -rf [a-z]* POSIX
Now take the axe to anything that should not be required, though make sure you leave anything related to devfsadm, sysevent, and booting the system alone. 'ls' does not need networking, so while you're at it take out anything network related. This is a script, right! You can fix it later. (The next part is not intended to look like uuencoding or base64. It just came out that way!)

cd /mnt/usr/lib && rm -rf zones zfs vplot term tabset t[0-9]* sunw,rcp spell
cd /mnt/usr/lib && rm -rf rcm print patch nss_nisplus.so.1 install ldap krb5
cd /mnt/usr/lib && rm -rf iconv inet crypto cron diff3prog diffh dns expreserve
cd /mnt/usr/lib && rm -rf netsvc newsyslog nfs nis nscd_nischeck nss_compat.so.1
cd /mnt/usr/lib && rm -rf nss_ldap.so.1 nss_nis.so.1 passwdutil.so.1 calprog
cd /mnt/usr/lib && rm -rf flash fp getoptcvt gmsgfmt help intrd lddstub libc
cd /mnt/usr/lib && rm -rf lp* lvm localedef lwp makekey more.help mps pt_chmod
cd /mnt/usr/lib && rm -rf fs/pcfs fs/fd fs/cachefs fs/nfs security/pam_krb5* drv
cd /mnt/usr/lib && rm -rf pam_dial* pam_ldap* pam_sample* platexec embedded_su
cd /mnt/usr/lib && rm -rf mdb smartcard sasl rsh kssladm ll* gss exrecover
cd /mnt/usr/lib && rm -rf nss_dns.so.1 abi adb class link_audit saf utmp* lib.b

Ok we need to copy the libraries we want to keep, and move them back once we have been brutal with 'rm'. It is now time to use the /work/libs file for input to fgrep. The logic of this script could be re-done, but it does the job for now.

cd /mnt/usr/lib
mkdir .bak
for i in `ls *[.]so*` ; do echo $i |fgrep -f /work/libs | cpio -pdm .bak 2>/dev/null ; done
rm -f *[.]so*
mv .bak/* .
rm -rf .bak
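
The copy out, delete, copy back logic above can also be written as a single pass that removes only what is not on the list. This is a minimal sketch of that alternative, not the method used here; prune_libs is a made-up name, and it uses 'grep -F', the POSIX spelling of fgrep (swap fgrep back in if your grep does not know -F).

```shell
#!/bin/sh
# prune_libs DIR KEEPFILE
# Hypothetical helper: removes every *.so* entry directly inside DIR whose
# name does not contain one of the strings listed in KEEPFILE. Symbolic
# links are judged by their own name, so kept links stay in place.
prune_libs() {
    dir=$1
    keep=$2
    for f in "$dir"/*.so*; do
        [ -e "$f" ] || [ -L "$f" ] || continue   # nothing matched the pattern
        if basename "$f" | grep -F -f "$keep" >/dev/null 2>&1; then
            :                                    # on the keep list, leave it
        else
            rm -f "$f"
        fi
    done
    return 0
}

# Example:
#   prune_libs /mnt/usr/lib /work/libs
```

Try it on a scratch copy of the directory first; an empty or mistyped keep file means everything matching *.so* is removed.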

Move up one level and take the broad axe to /usr. I will leave /usr/bin and /usr/sbin to your discretion.

cd /mnt/usr && rm -rf X X11 adm ccs dict dt java kernel kvm mail net
cd /mnt/usr && rm -rf platform preserve pub sadm sfw share snadm spool
cd /mnt/usr && rm -rf news old openwin perl5 src xpg4 proc

For the second time use the /work/libs file, but this time on /lib.

cd /mnt/lib
mkdir .bak
for i in `ls *[.]so*` ; do echo $i |fgrep -f /work/libs | cpio -pdm .bak 2>/dev/null ; done
rm -f *[.]so*
mv .bak/* .
rm -rf .bak

The final directory we will attack before we build the image and reboot will be /sbin

cd /mnt/sbin && rm -rf rc* install* ifconfig ifparse getpart getmemory zpool
cd /mnt/sbin && rm -rf rc* biosdev sysid* meta* p* d* e* route* suninstall jsh
cd /mnt/sbin && rm -rf bootadm bpgetfile cleanup_hosts getInstallLangs swap*
cd /mnt/sbin && rm -rf getbootargs grepInstalledLocales hostconfig in.mpathd
cd /mnt/sbin && rm -rf mkmenu mountall netstrategy setup* selection siwrapper
cd /mnt/sbin && rm -rf soconfig umountall zfs zonename getconsole get_netmask

Now create 2 files. We need bootadm to just exit. The install-discovery file is just the original stripped down (remove the installation code) and compressed to fit in the blog.
vi /mnt/sbin/bootadm

#!/sbin/sh
exit 0

vi /mnt/sbin/install-discovery

#!/sbin/sh
# Copyright 2005 Sun Microsystems, Inc. All rights reserved
# Use is subject to license terms.
SHELL=/sbin/sh;export SHELL
PATH=/sbin:/usr/bin:${PATH};export PATH
PLATFORM=`/sbin/uname -p`;export PLATFORM
_INIT_RECONFIG=set; export _INIT_RECONFIG #Dont know what this does
exec </dev/console >/dev/console 2>&1
/sbin/mount -F tmpfs swap /tmp
if [ $? -ne 0 ]; then
echo "tmpfs mount failed."
/sbin/sh
fi
( cd /.tmp_proto; find . -print -depth | cpio -pdm /tmp 2>/tmp/cpio.out )
echo "Memory free after tmpfs initialization: `/sbin/mem`"
echo "swap - /tmp tmpfs - no -" >> /etc/vfstab
echo "/proc - /proc proc - no -" >> /etc/vfstab
find dev -depth -print | cpio -pdum /tmp >/dev/null 2>&1
ln -sf /devices /tmp/devices
/sbin/mount -F lofs -O /tmp/dev /dev
mkdir -p /tmp/etc
mkdir -p /tmp/etc/sysevent && /usr/lib/sysevent/syseventd -r /tmp
/usr/lib/devfsadm/devfsadmd -r /tmp -p /tmp/root/etc/path_to_inst
eval `/sbin/get_root -t Roottype -b Rootfs /`
echo "${Rootfs} - / ${Roottype} - no ro" >> /etc/vfstab
echo
echo "Welcome to Super Small Solaris"
echo
echo "Dont expect your normal list of command."
echo "Just try 'ls /sbin' and you will find all you want."
echo
exec /sbin/sh


After you have made the scripts executable you should unmount /mnt and commit the changes to the image on /grub. The instructions below will back up the filesystem, and restore it onto a smaller UFS image. If you do not do this you will most likely find that your image has actually grown, as the UFS partition is the same size.

chmod 755 /mnt/sbin/bootadm /mnt/sbin/install-discovery
cd /
du -ks /mnt
# Take note of the size returned and add 15%
sync
ufsdump 0f /work/x86.microroot.dmp /mnt
umount /mnt
lofiadm -d /tmp/x86.microroot
mkfile k /tmp/x86.microroot
DEV="`lofiadm -a /tmp/x86.microroot`"
newfs -m 0 $DEV
mount -F ufs $DEV /mnt
cd /mnt
ufsrestore -rf /work/x86.microroot.dmp
rm restoresymtable
cd /
umount /mnt
lofiadm -d /tmp/x86.microroot
gzip -c /tmp/x86.microroot > /boot/x86.microroot
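
The "add 15%" step is easy to get wrong by hand late at night, so here is a small sketch of the arithmetic. image_size_kb is a made-up helper; it takes the kilobyte figure printed by 'du -ks /mnt' and rounds the padded result up to a whole kilobyte.

```shell
#!/bin/sh
# image_size_kb USED_KB
# Hypothetical helper: prints USED_KB plus 15%, rounded up, in kilobytes.
image_size_kb() {
    used_kb=$1
    echo $(( (used_kb * 115 + 99) / 100 ))
}

# Example, run while /mnt is still mounted:
#   used=$(du -ks /mnt | awk '{ print $1 }')
#   mkfile "$(image_size_kb "$used")k" /tmp/x86.microroot
```

The 15% is the headroom suggested above; if newfs or ufsrestore complains about space, raise the factor and try again.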

The last step we should do is to create an iso image to try out on a CD

cd /tmp
mkdir iso
cp -pr /boot /tmp/iso
rm /tmp/iso/boot/x86.miniroot-safe
rm /tmp/iso/boot/boot_archive

cat > /tmp/iso/boot/grub/menu.lst << EOM
default 0
timeout 10
splashimage /boot/grub/splash.xpm.gz
title Solaris Microroot
kernel /boot/multiboot kernel/unix -s
module /boot/x86.microroot
EOM

mkisofs -R -b boot/grub/stage2_eltorito -no-emul-boot -boot-load-size 4 -boot-info-table -o /work/microroot.iso iso

At the end of all of this you should have a 25 megabyte ISO image in /work. For the moment I will stop here. I have previously gone further and been able to reduce the ISO to 18 megabytes, which still includes the full 32-bit kernel from the latest Open Solaris builds. In the next part of the article we will see if we can do the same thing using scripts, getting the files straight from an Open Solaris build. This involves a little more work, as we have to tackle smf and also build our own /devices and /dev directories.

Good Luck. Let me know how you go!

Friday, July 07, 2006

How small can you make Open Solaris - Part 1

Solaris started its life as an operating system for workstations and then progressed to servers. It has always been an operating system dominated by features, showing Sun's R&D capability. This is great if you are installing a server or a desktop, but it has far too many features for building an appliance. Luckily the installer comes with some reduced installation clusters which try to bring the installation down to the bare minimum. Unfortunately, the last time I looked, the smallest install was still several hundred megabytes. Linux, on the other hand, has had a project going for a while now called "Damn Small Linux", which strips Linux down to around 50 megabytes. This is a perfect base from which to start building an appliance, build your own distro, or strip the kernel down further for an embedded device.

Can Solaris become as small as "Damn Small Linux"? The answer is a resounding yes (and probably smaller). Let's investigate how this can be done. The first thing to do is to state the goal, which is to be able to successfully boot into a shell and execute a simple command such as 'ls'. The logical place to start is with the smallest running version of Solaris supplied by Sun. If you have an x86 grub version of Solaris you will find a 52 megabyte file in your /boot directory called x86.miniroot-safe. This file is a gzipped UFS image that is booted when you select "Solaris failsafe" from the grub menu. Using it to boot to single user mode will mount the root filesystem and give you a root shell. It also contains the code to start a Solaris installation.

Now that we have found an ideal candidate, let's start ripping it apart. The first step is to copy it (as you may need the original if you break something), and set up a new menu option in grub.

cd /boot
cp x86.miniroot-safe x86.microroot
cd /boot/grub

Edit the file menu.lst, copy the failsafe section and modify it to look something like this -

title Solaris Micro Root
kernel /boot/multiboot kernel/unix -s
module /boot/x86.microroot

If you want, you can now reboot and select "Solaris Micro Root" when the grub menu comes up. It should boot into your copy of failsafe. After you have finished testing, reboot into multiuser mode and mount this image so you can change it.

The file '/boot/x86.microroot' is actually a gzipped UFS filesystem image, which can be mounted and changed with a couple of commands. The following is an example of the procedure for making changes. I would suggest you create mount and unmount scripts to automate the process. (Note: You will need superuser privileges for the following steps: the root user, or at least sys_mount, file_dac_read and file_dac_write.)

Important - Make a backup before making changes, and document your changes

cp /boot/x86.microroot /boot/x86.microroot.bak

Unzip image to /tmp

gzcat /boot/x86.microroot > /tmp/microroot.img

Create a loopback device for this file. The shell variable 'dev' captures the device name for later use.

dev="`lofiadm -a /tmp/microroot.img`"

Finally mount the image using the loopback device.

mount -F ufs ${dev} /mnt

At this point your image is mounted and you can cd to /mnt and make your changes. Note: Be VERY VERY CAREFUL that you are changing or removing files in /mnt and NOT in the root filesystem. It could get very ugly if you make this mistake. Take your time and be very careful. Once you have made the changes DON'T REBOOT. You will need to follow the next steps and commit the changes before you reboot. Also, rather than deleting a file or a directory, it is better practice to move it to a backup directory and then test the changes. If the changes were good then you can delete the backup directory later. If the changes caused problems you can simply move the files and directories back in your next editing session.

Unmount the image and delete the loopback device.

cd /
umount /mnt
lofiadm -d /tmp/microroot.img

Copy/gzip the changes back to the /boot directory

gzip -c /tmp/microroot.img > /boot/x86.microroot
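
The mount and unmount scripts suggested earlier could be sketched roughly as follows. The paths are the ones used in this article, but the function names and the DRYRUN switch are my own assumptions. With DRYRUN set the commands are only printed, which is a safe way to check them before letting them loose on a real image.

```shell
#!/bin/sh
# Sketch of mount/unmount helpers for the microroot image.
# Assumptions: default paths as used in the article; microroot,
# microroot_mount_cmds, microroot_umount_cmds and DRYRUN are hypothetical.
IMAGE=${IMAGE:-/boot/x86.microroot}
WORK=${WORK:-/tmp/microroot.img}
MNT=${MNT:-/mnt}

# Print the commands that unpack and mount the image, one per line.
microroot_mount_cmds() {
    echo "gzcat $IMAGE > $WORK"
    echo "dev=\`lofiadm -a $WORK\`"
    echo "mount -F ufs \$dev $MNT"
}

# Print the commands that unmount the image and commit it back.
microroot_umount_cmds() {
    echo "cd /"
    echo "umount $MNT"
    echo "lofiadm -d $WORK"
    echo "gzip -c $WORK > $IMAGE"
}

# microroot mount|umount: print the commands if DRYRUN is set,
# otherwise feed them to a shell that stops at the first failure.
microroot() {
    if [ -n "$DRYRUN" ]; then
        "microroot_${1}_cmds"
    else
        "microroot_${1}_cmds" | sh -e
    fi
}

# Example use:
#   DRYRUN=1 microroot mount     # show what would be run
#   microroot mount              # really mount the image (needs root)
#   microroot umount             # unmount and write the image back
```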

You can now reboot and test your changes. If the system hangs, just reset the system and undo what you did and try something else. If the system reboots too fast for you to read the kernel messages, a handy tip is to add the "-kd" options in the grub menu (combined with or after the '-s'). This will put the kernel straight to debug mode. To continue the boot type ':c' at the prompt. If the kernel panics it should print a message and then wait for you to press a key before rebooting.

Using this method I reduced the entire image to around 42 megabytes (uncompressed) without touching the 32-bit kernel. I was then able to create an 18 megabyte bootable Solaris. In the next part I will list the areas you should remove or modify. Hopefully, in Part 3, I will have finished a script which takes an Open Solaris build from the prototype directory and builds the microroot for you.

Thursday, June 29, 2006

ZFS Root on Solaris Part 3

Hopefully, you have had success and you now have root on ZFS. In this next post I will describe how to get access to the root partition from failsafe. If you tried like me, you will find that you cannot see the zfs filesystems from failsafe. This is because the zpool.cache configuration file in /etc/zfs has not been copied onto the failsafe miniroot. One thing that is not wise at this point is to use the command "zpool import -f [pool_name]" to import the pool. While it may work, it will probably make your pool inaccessible when you next boot. So what you need to do is copy the configuration file from your running system into the failsafe miniroot archive. Let's get into it. The steps are -

1) Boot ZFS root partition and get access to root

2) gunzip your miniroot image into /tmp

gzcat /grub/boot/x86.miniroot-safe > /tmp/miniroot.img

3) Use the loopback file driver to mount the image on /mnt. Replace /dev/lofi/1 with the device that lofiadm returns.

lofiadm -a /tmp/miniroot.img
mount /dev/lofi/1 /mnt

4) Copy the file /etc/zfs/zpool.cache into /mnt/etc/zfs

cp -p /etc/zfs/zpool.cache /mnt/etc/zfs

5) unmount the filesystem

umount /mnt
lofiadm -d /tmp/miniroot.img

6) gzip the image back to grub.

gzip -c /tmp/miniroot.img > /grub/boot/x86.miniroot-safe

7) Edit the /grub/boot/grub/menu.lst and make sure that there is an identical "root" entry for failsafe and ZFS. Below are the entries for my machine.

#---------- ADDED BY BOOTADM - DO NOT EDIT ----------
title Solaris failsafe
root (hd0,0,a)
kernel /boot/multiboot kernel/unix -s
module /boot/x86.miniroot-safe
#---------------------END BOOTADM--------------------
title Solaris ZFS
root (hd0,0,a)
kernel /boot/multiboot
module /boot/boot_archive

8) You should now be able to reboot into failsafe and see your zfs pool with the "zpool list" command. You should also be able to see your root filesystem with the "zfs list" command. When you booted, it would have asked you if you want to mount a filesystem on /a. If you answered yes, then unmount it with "umount /a". Now you can use the mount command to mount your ZFS root filesystem on to /a. Note, I previously named my root filesystem "intdisk/snv42_root". Change the command to suit your setup. Also don't forget to add the option "-F zfs". If you don't, mount will try NFS....

mount -F zfs intdisk/snv42_root /a
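
Step 7 above asks for identical "root" entries, which is easy to get wrong by eye. As a rough aid, the sketch below prints each "root" line next to the title it belongs to, so a mismatch stands out. The function name list_grub_roots is hypothetical, and it assumes the simple title/root/kernel/module layout shown in my menu.lst.

```shell
#!/bin/sh
# Hypothetical helper: for every "root" line in a grub menu.lst, print the
# root device followed by the title of the entry it belongs to.
list_grub_roots() {
    awk '
        $1 == "title" {
            # Rebuild the title text from the remaining fields.
            title = $2
            for (i = 3; i <= NF; i++) title = title " " $i
        }
        $1 == "root" { print $2, title }
    ' "$1"
}

# Example use:
#   list_grub_roots /grub/boot/grub/menu.lst
```

With the menu.lst shown in step 7, both lines of output should start with (hd0,0,a).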

Finally you can now cd into /a and fix your system. Enjoy!!!

Wednesday, June 28, 2006

ZFS Root on Solaris Part 2

The first post was just a quick post to show people the commands to use to create a ZFS root filesystem with a small UFS grub boot partition. This post is an update using a fresh Solaris install, with comments on what is happening. Most of the steps follow Tabriz's blog on blogs.sun.com.


The first step is to do a clean Solaris install. I will leave out the details, except to say that my setup used the following partitioning -


c0d0s0  /grub  150MB         Location of the UFS grub boot code
c0d0s1  swap   1GB           Standard Swap/Dump partition
c0d0s3  /      4GB           Install/Upgrade UFS partition
c0d0s7  ZFS    all the rest  Later configured as "intdisk" pool, which contains
                             ZFS root & clones plus home directories etc


Once you have installed all of your standard software and configuration, the fun begins.

Create a zfs pool, and turn off mounting filesystems by default

zpool create -f -m none intdisk c0d0s7

Create a zfs partition for the root filesystem. Being on a laptop, I am saving space by turning on compression. It may also give a performance boost at the same time. Note that I am using a symbolic name for the partition name. Later I am going to clone the filesystem and create a test environment.

zfs create intdisk/snv42_root
zfs set mountpoint=legacy intdisk/snv42_root
zfs set compression=on intdisk/snv42_root

Create a mountpoint for the zfs root and use ufsdump/ufsrestore to copy all of the UFS root filesystem. You could use cpio or tar, but you also want the data underneath /devices.

mkdir -m 0755 /zfsroot
echo "intdisk/snv42_root - /zfsroot zfs - yes -" >> /etc/vfstab
mount /zfsroot
cd /zfsroot
ufsdump 0f - / | ufsrestore -rf -

Configure /etc/system on the zfs root to use the correct zfs filesystem. This configuration actually gets used by the kernel from within the boot archive, and gets copied each time you update the archive. Also make sure the zpool configuration gets added to the boot archive, and use a classic one-liner to update the vfstab on the zfs root.

echo "rootfs:zfs" >> /zfsroot/etc/system
echo "zfsroot:intdisk/snv42_root" >> /zfsroot/etc/system
echo "etc/zfs/zpool.cache" >> /zfsroot/boot/solaris/filelist.ramdisk
grep -v 'intdisk/snv42_root' /etc/vfstab | awk '$3 == "/" { printf "intdisk/snv42_root\t-\t/\tzfs\t-\tno\t-\n" } ; $3 != "/" { print $0 }' > /zfsroot/etc/vfstab
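
The vfstab one-liner is a little dense, so here is the same idea written as a filter that can be tried on a copy of the file first. This is a sketch: the function name rewrite_vfstab_root is made up, and it uses "awk -v", which assumes a POSIX awk (on Solaris that probably means nawk rather than the old /usr/bin/awk).

```shell
#!/bin/sh
# Sketch of the vfstab one-liner as a filter: reads a vfstab on stdin and
# writes the rewritten table to stdout. The function name is hypothetical.
rewrite_vfstab_root() {
    dataset=$1
    # Drop any existing line that mentions the dataset, then replace the
    # line whose mount point (field 3) is "/" with a ZFS root entry.
    grep -v "$dataset" | awk -v ds="$dataset" '
        $3 == "/" { printf "%s\t-\t/\tzfs\t-\tno\t-\n", ds; next }
        { print }'
}

# Example use, writing to a scratch file rather than the real vfstab:
#   rewrite_vfstab_root intdisk/snv42_root < /etc/vfstab > /tmp/vfstab.new
```

Comparing the scratch file against /etc/vfstab with diff before copying it into place is a cheap way to confirm that only the root line and the temporary /zfsroot line changed.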

Create a modified hack from Tabriz's blog to fix bootadm.

mv /zfsroot/sbin/bootadm /zfsroot/sbin/bootadm.real
cat - > /zfsroot/sbin/bootadm << EOM
#!/usr/bin/sh

/sbin/bootadm.real "\$@"
if [ "\$1" = "update-archive" -a -d /grub/boot/grub ]; then
/usr/bin/cp /platform/i86pc/boot_archive /boot/boot_archive
fi
exit 0
EOM

chmod +x /zfsroot/sbin/bootadm
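
A note on the backslashes in the wrapper above: because the EOM delimiter is not quoted, the shell expands variables while it writes the file, so "\$@" and "\$1" have to be escaped to end up as literal "$@" and "$1" in the generated script. The following is a small stand-alone illustration of the same technique, not the bootadm wrapper itself. The function name write_wrapper and the message it prints are made up for the example.

```shell
#!/bin/sh
# Illustration of generating a wrapper script with a here-document.
# $real is expanded now, while the file is written; the escaped \$@,
# \$? and \$status are left for the generated script to expand later.
write_wrapper() {
    target=$1
    real=$2
    cat > "$target" << EOM
#!/bin/sh
# Generated wrapper: run the real command, then report its exit status.
$real "\$@"
status=\$?
echo "wrapped $real exited with \$status"
exit \$status
EOM
    chmod +x "$target"
}

# Example use:
#   write_wrapper /tmp/wrapper.sh /usr/bin/ls
#   /tmp/wrapper.sh -l /boot
```

Unlike this illustration, my bootadm wrapper always exits 0, regardless of what the real command returned.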

Now we are ready to update the boot archive and configure grub. The "root (hd0,0,a)" should point to the grub partition.

/usr/sbin/bootadm update-archive -R /zfsroot
cp -pr /zfsroot/boot /grub
cp /zfsroot/platform/i86pc/boot_archive /grub/boot/boot_archive
(
echo "title Solaris ZFS"
echo "root (hd0,0,a)"
echo "kernel /boot/multiboot"
echo "module /boot/boot_archive"
) >> /grub/boot/grub/menu.lst

cd /grub/boot/grub
installgrub stage1 stage2 /dev/rdsk/c0d0s0

Now you can reboot, and if all goes well you should be able to use the new entry in the grub menu to boot into the zfs partition. Ok, if it all looks good, let's try to configure another zfs root partition using a clone. To create a copy we simply do a snapshot and a clone of the current zfs root partition. I am naming this partition "snv42_test". Note the changes...

zfs snapshot intdisk/snv42_root@initial
zfs clone intdisk/snv42_root@initial intdisk/snv42_test
zfs set mountpoint=legacy intdisk/snv42_test
zfs set compression=on intdisk/snv42_test
echo "intdisk/snv42_test - /zfsroot zfs - yes -" >> /etc/vfstab
mount /zfsroot

Follow steps similar to those you used last time. Again, note the changes I have made to the boot archive name and filesystem name.

sed -e "s/snv42_root/snv42_test/" /etc/system > /zfsroot/etc/system
grep -v 'intdisk/snv42_test' /etc/vfstab | awk '$3 == "/" { printf "intdisk/snv42_test\t-\t/\tzfs\t-\tno\t-\n" } ; $3 != "/" { print $0 }' > /zfsroot/etc/vfstab

cat - > /zfsroot/sbin/bootadm << EOM
#!/usr/bin/sh

/sbin/bootadm.real "\$@"
if [ "\$1" = "update-archive" -a -d /grub/boot/grub ]; then
/usr/bin/cp /platform/i86pc/boot_archive /boot/boot_archive.test
fi
exit 0
EOM

chmod +x /zfsroot/sbin/bootadm

/usr/sbin/bootadm update-archive -R /zfsroot
cp /zfsroot/platform/i86pc/boot_archive /grub/boot/boot_archive.test
(
echo "#"
echo "title Solaris ZFS test"
echo "root (hd0,0,a)"
echo "kernel /boot/multiboot"
echo "module /boot/boot_archive.test"
) >> /grub/boot/grub/menu.lst
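
Since every extra clone needs the same menu entry with only the title and boot archive name changed, the echo block can be wrapped in a small function. This is just a sketch: the function name grub_zfs_entry and the GRUB_ROOT variable are hypothetical, and the default root device is the one from my own setup.

```shell
#!/bin/sh
# Hypothetical helper: print a grub menu entry for a ZFS root clone.
#   $1 = menu title, $2 = boot archive file name under /boot
GRUB_ROOT=${GRUB_ROOT:-"(hd0,0,a)"}

grub_zfs_entry() {
    echo "#"
    echo "title $1"
    echo "root $GRUB_ROOT"
    echo "kernel /boot/multiboot"
    echo "module /boot/$2"
}

# Example use (appends to the menu on the grub partition):
#   grub_zfs_entry "Solaris ZFS test" boot_archive.test >> /grub/boot/grub/menu.lst
```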

Now you should be able to reboot, and you should now have a cloned test environment. Only changes from the original filesystem are added to the diskspace usage. The number of clones you can have is limited by the number of boot archives you can jam into the /grub partition, and to a lesser extent by the free space you have in the ZFS pool.

Have Fun!

ZFS root on Solaris

Below is the log of commands I used to set up a ZFS root partition with a small UFS partition for grub. Previously I was swapping between 2 root partitions for upgrades, with everything I wanted to keep between upgrades on a ZFS partition. This time I divided one of the root partitions into a small UFS partition, and used the rest as a ZFS root.

The next step is a total backup (done), and then to repartition the disk into a small UFS boot partition for grub boots, a ~4GB UFS install partition, and the rest, including root, all on one ZFS partition. On a laptop, the extra diskspace is very handy. I will also spend some time on writing better notes :-)


newfs /dev/dsk/c0d0s0
if [ ! -d /altroot ]; then
mkdir -m 0755 /altroot
fi
echo "/dev/dsk/c0d0s0 /dev/rdsk/c0d0s0 /altroot ufs 3 yes -" >> /etc/vfstab
mount /altroot
zpool create -m none rootdisk c0d0s4
zfs create rootdisk/root
zfs set mountpoint=legacy rootdisk/root
zfs set compression=on rootdisk/root

mkdir -m 0755 /zfsroot
echo "rootdisk/root - /zfsroot zfs - yes -" >> /etc/vfstab
mount /zfsroot

cd /zfsroot
ufsdump 0f - / | ufsrestore -rf -

echo "rootfs:zfs" >> /zfsroot/etc/system
echo "zfsroot:rootdisk/root" >> /zfsroot/etc/system
echo "etc/zfs/zpool.cache" >> /zfsroot/boot/solaris/filelist.ramdisk
grep -v 'rootdisk/root' /etc/vfstab | awk '$3 == "/" { printf "rootdisk/root\t-\t/\tzfs\t-\tno\t-\n" } ; $3 != "/" { print $0 }' > /zfsroot/etc/vfstab
mv /zfsroot/sbin/bootadm /zfsroot/sbin/bootadm.real

cat - > /zfsroot/sbin/bootadm << EOM
#!/usr/bin/sh

/sbin/bootadm.real "\$@"
/usr/bin/cp /platform/i86pc/boot_archive /boot/boot_archive
exit 0
EOM

chmod +x /zfsroot/sbin/bootadm
/usr/sbin/bootadm update-archive -R /zfsroot
cp -pr /zfsroot/boot /altroot
cp /zfsroot/platform/i86pc/boot_archive /altroot/boot/boot_archive

(
echo "title Solaris ZFS"
echo "kernel /boot/multiboot"
echo "module /boot/boot_archive"
) >> /altroot/boot/grub/menu.lst

cd /altroot/boot/grub
installgrub stage1 stage2 /dev/rdsk/c0d0s0

Wednesday, June 14, 2006

Happy Birthday OpenSolaris

Today, as some are aware, is the first birthday of the OpenSolaris Community, so it is time to reflect on the substantial progress that has been made and on the future. To fill in the picture properly, I will digress a little and go back in time to the pre-Internet days when dinosaurs and BSD roamed the earth. When I was a young system admin at a University, the Solaris source code was obtainable, and easy to modify for our environment. Therefore to me Solaris was always open source. On many occasions, with a quick flick through the files, you could quickly solve problems you were having. Insulated from the outside world, connected to other universities all over the world (pre and post Internet), and having a regular update of source delivered on tapes, open source was always easily obtainable, and very useful.

Somewhere down the track Sun decided to jump camps from BSD to System V Unix. Whatever reasons they had, to me this is when Sun changed focus from being a company supporting research by building cheap high performance workstations to focusing on the server business market. While this change prepared them at the time for the first Internet age, it was a shift away from the largely open source BSD to the very much proprietary System V.

Many years down the track, with the help of Wall Street growing the Internet much faster than it should, the Internet bubble went Bang! To add to Sun's problems of the times, Intel PC chips at the low end were becoming performance competitive with Sparc, and IBM's Power PC was impressive at the top end. Like vultures surrounding the dying Sun corpse, Sun's competitors tried to deliver the knockout blow by helping make the Open Source Linux kernel competitive with Sun's crown jewel, Solaris. This is very ironic, as to this day these supporters of the Linux kernel are right up there next to Microsoft as the most closed source companies in the world. Sun could not win a trick. Behind the scenes, they were preparing to Open Source Solaris. They had bought software companies such as Staroffice and turned them into Open Source, and many parts of their software stack were being Open Source'd, but somehow the wider Open Source community labelled them as “closed”.

Ok, one year ago to this day, Sun finally created the OpenSolaris community, and released large portions of Solaris as Open Source under a Mozilla type license. What does this do for Sun?

  • Sun is now the largest contributor of Open Source software (and hardware) on the planet, bar none (something some Slashdotters may never accept). It is now difficult (or biased) to label Sun as a “closed” company.

  • Solaris is now certified to run on 100's of platforms. This is truly remarkable, as it is an order of magnitude greater than the number of supported Linux platforms. Now, with Solaris on non-Sun platforms no longer being treated like a second class citizen, and with Sun having an extremely good x86 product line, Sun has the potential to reach out to new customers that they could only have dreamed about having a couple of years ago.

  • Through the Open Solaris community, Sun is showing just how transparent and open the company really is. Over the last year, since the launch of Open Solaris and now Open Sparc, many of Sun's critics are either changing their opinions of Sun's openness, focusing on other areas such as Java Open Sourcing, or plainly showing that they have vested interests which will always be opposed to Sun.

  • Just looking through the Solaris source code shows that it is extremely well written and structured, which explains why Solaris has a reputation for being reliable and secure. It also shows that their developers are extremely talented and well organized. This demonstrates to customers that Sun is not only able to deliver a good product, but also has the ability to support it.


Now, more importantly, what has this done for the wider community?

  • It is now much easier for individuals and companies to write device drivers for their products. This was one area that always had major problems for Solaris on non-Sparc systems (along with Sun dropping it for a while).

  • Other Open Source communities such as BSD can now implement advanced Solaris features such as dtrace, zfs etc. in their codebase. Ironically, since the Linux kernel license is more restrictive than that of BSD, they have less freedom to do this in Linux.

  • From personal experience in the OpenSolaris community over the year, I can testify that Sun is by a long way the most transparent company I have ever seen. While they are tight lipped over the release schedule and configuration for some of their hardware (which is understandable), you can quite easily go into their software forums and not only see their internal debates, but also put your 2 cents worth in, and it is taken quite seriously.

  • I read the other day that Dell is now very pleased with its ability to field 90% of support calls for Linux without having to run off to Red Hat (which may then pass them on to a public mailing list). After laughing for 5 minutes I picked myself up off the floor and thought about it. For many years I would generally spend a fair amount of time looking into a problem (to make sure it wasn't something I did) before I would lodge a support call to Sun or others. Generally the amount of time and effort needed to go through the support channels for what was most likely a trivial/stupid problem or a known bug was greater than solving the problem myself. Nowadays, many simple annoying problems are just a matter of writing to the relevant Sun forums. Not only do you get a quick response, you will generally get a response back from one of the developers (or a fellow external community member), who is only too pleased to help. If you are a customer, this is great for the 90% Dell type questions, for which you need a simple quick answer. For Sun it not only lets their developers know exactly the problems the customers are having, but also helps free up the support network to focus on more difficult support issues.

  • When not at work, for many years I have survived at home with my primary desktop being a Solaris machine, such as a Sun Blade 100 with a Sun PCI card for Windows apps, and a Linux development box somewhere on the floor. Nowadays I survive on 1 laptop with 2 100GB discs (physically swapped). On one disc I have one partition with Windows, which came as a Microsoft tax, and Solaris running in VMware. The other partition is Linux, which currently is the latest Ubuntu. Most people would use this disc for their day to day work, as I did in the past. Today it just spends its life sitting in a drawer, and I now prefer to use the other disc, which is 100% OpenSolaris. Many of the reasons I would use the other disc are now gone. The effort in swapping discs is greater than the value of the change. From a pure user experience, there is little difference between running JDS/Gnome on Solaris and Gnome on Linux. Of course, one standout point here is that 3D graphics support for ATI cards is non-existent, so most good games (other than freeciv) are out. Also, the ability to pull packages down through apt-get is sorely missed. Running one of the other OpenSolaris distros would close this gap. The benefits for a developer on Open Solaris, with the combination of features such as ZFS, dtrace, zones, Sun Studio, Netbeans, etc., are very significant.


Now that we have made it to the first birthday of the Open Solaris community, what are some of the more interesting things planned in Open Solaris for the future, and what would I like to see? I will do my top ten.

    1) Xen – Solaris on Dom0, would be very nice. Having relatively limited resources on my laptop, products such as VMware are still a bit heavy to emulate a datacenter. The combination of Xen, and Zones, go a long way to solving this problem.

    2) Trusted Solaris Extensions – Anybody who works with me knows that I put great importance on security, and on control over who can do what on a system. While Solaris 10 has many of the features migrated from Trusted Solaris, it lacks the whole security from day one approach that Trusted Solaris has. Today Solaris 10 has many things turned on by default to make life easy for a Sysadmin. Trusted Solaris takes the hardline approach, where the default profile makes a Sysadmin think about what should be turned on, who should have access to what, and even who should talk to who.

    3) ZFS boot – While I have a great fondness for ufs, especially its ability in Solaris 10 to mirror disks from jumpstart, zfs is just so much more efficient. The only fixed barrier is the number and size of disks I have attached. Live upgrade into a snapshot would be a major win, and very cool.

    4) Sparks etc projects – Coming from a background of a large University environment, I know that LDAP directories, authentication, authorization and provisioning a user can be difficult. While Sun's Identity and Directory products are technically superior to other products on the market, they do not easily transfer to the desktop. The Active Directory and Windows desktop combination, when set up by a good sysadmin, is superior to anything Solaris and Linux have. The level of direct control of the desktop from the directory is impressive when implemented properly. An advancement in this area is greatly needed for the Unix/Linux world.

    5) JDS – Vermillion is coming on very nicely, and is catching quickly up to the commercial Linux Gnome implementations. The sooner CDE is shown the door the better.

    6) Device Drivers – Great inroads have been made into the one big advantage that Windows and Linux have had over Solaris x86 (Sparc always had good support for its hardware). While you would class it as a small subset of drivers, it mostly covers all of the important ones. Solaris actually has an advantage in the future here, as it has much less baggage of old drivers. On my laptop, the only thing I am really missing is a good 3D driver for the graphics card. The others just don't affect me working day to day, and I rarely ever used them under the other OS's. For mobile phone junkies bluetooth would be nice.

    7) Porting Solaris to other devices – I would feel safer if my mobile phone (or my fridge) was running Solaris rather than something from Microsoft, or even Linux. While it is not a big thing, as long as the phone (or fridge) does its function, I am generally pleased....

    8) GNU – While GNU/Solaris pre-dates GNU/Linux, the creation of such a beast with the latest versions has been up to the owner. Having the GNU binaries and libraries in a known configuration makes it easier to port other applications across. Also, having up to date versions would negate the need to uninstall the delivered version, or to compile the newer version for another directory and have dynamic library madness.

    9) Patching and Packaging – This can be a painful area for sysadmins. While adding zones in theory makes life easier, currently in practice it can be troublesome. Now that the package tool source code has been released, it may spur the community to come up with some alternatives and tools.

    10) Games – How could I leave this to last :-) Solaris has been lacking in this area for a long time. This is due to the fact that previously, buying a Sparc machine with a 3D graphics card would cost you a lot of money, as you were really buying something designed as a workstation. It is only because Nvidia and ATI now offer closed source 3D drivers that Linux is a viable platform for games. Having an ATI graphics card in my laptop, I cannot even run Java games which need 3D drivers. For the moment it is freeciv (the latest beta is really nice). I hope ATI will release a driver before the next birthday.


To summarize, all in all, to extend an Australian term, “It's been a bloody good year for OpenSolaris”. Let's make 'every' next year better!!!

P.S. A special thanks to Scott, Jonathan, and the team for the leadership to create OpenSolaris.