This is a technique for failing over a pair of NICs on a Solaris 10 machine. I am told this will only work with Sun NICs that come with Sun hardware, but I haven’t tested that claim.
You need to be root for all of the following. I am also assuming you’re going to be able to reboot the machine once this setup is in place.
First, make sure you have your hostname in your /etc/hosts file:
[ck3k@ra ~]$ cat /etc/hosts
#
# Internet host table
#
127.0.0.1 localhost
10.10.1.112 ra loghost
OK, so in your /etc/hosts, ra is your hostname. Now we need to edit or create two files. NOTE: the dmfe* part of each file name is the interface name, so it depends on your hardware; on a SunFire T2000, for example, the files are hostname.ipge*.
First, set up your main NIC’s hostname file (this should already exist in your install):
[ck3k@ra ~]$ cat /etc/hostname.dmfe0
ra netmask + broadcast + group failover up
So you see your hostname and its IP info, but now it has a group called “failover.” This creates a group of NICs that will fail over for one another.
Now create the hostname file for your second NIC:
[ck3k@ra ~]$ cat /etc/hostname.dmfe1
group failover up
This just adds the NIC to the group for failover. Now reboot your system, and the settings will take effect on boot. Keep in mind this will work for virtual interfaces as well, so all of your Zones will also fail over.
Once your system reboots, the best way to test is to be on the LOM via serial and to unplug one of your two NICs from your switch.
Here is some of the output of the NIC going down:
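If you would rather generate the two hostname files than type them, something like the following could work. This is only a minimal sketch: write_ipmp_files is a hypothetical helper name of my own, not a Solaris tool, and it writes into a scratch directory rather than /etc so you can review the result before copying the files into place yourself.

```shell
#!/bin/sh
# Sketch: write the two IPMP hostname files into a staging directory.
# write_ipmp_files is a hypothetical helper, not part of Solaris.
# Arguments: hostname, group name, first NIC, second NIC, output directory.
write_ipmp_files() {
    host=$1; group=$2; nic1=$3; nic2=$4; dir=$5
    mkdir -p "$dir" || return 1
    # First NIC: carries the host's address and joins the group.
    printf '%s netmask + broadcast + group %s up\n' "$host" "$group" \
        > "$dir/hostname.$nic1"
    # Second NIC: no address of its own, only group membership.
    printf 'group %s up\n' "$group" > "$dir/hostname.$nic2"
}

# Example using the values from this post; the staging path is arbitrary.
staging=/tmp/ipmp-staging.$$
write_ipmp_files ra failover dmfe0 dmfe1 "$staging"
cat "$staging/hostname.dmfe0" "$staging/hostname.dmfe1"
```

Writing to a staging directory first is a deliberate choice here: a typo in /etc/hostname.* only shows up at the next boot, so it is worth a look before you commit to it.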
Aug 16 17:54:28 ra dmfe: [ID 801593 kern.notice] NOTICE: dmfe0: PHY 1 link down
Aug 16 17:54:28 ra in.mpathd[135]: [ID 215189 daemon.error] The link has gone down on dmfe0
Aug 16 17:54:28 ra in.mpathd[135]: [ID 594170 daemon.error] NIC failure detected on dmfe0 of group failover
Aug 16 17:54:28 ra in.mpathd[135]: [ID 832587 daemon.error] Successfully failed over from NIC dmfe0 to NIC dmfe1
Here it is coming back up:
Aug 16 17:55:05 ra dmfe: [ID 801593 kern.notice] NOTICE: dmfe0: PHY 1 link up 100 Mbps Full-Duplex
Aug 16 17:55:05 ra in.mpathd[135]: [ID 820239 daemon.error] The link has come up on dmfe0
Aug 16 17:55:05 ra in.mpathd[135]: [ID 299542 daemon.error] NIC repair detected on dmfe0 of group failover
Aug 16 17:55:05 ra in.mpathd[135]: [ID 620804 daemon.error] Successfully failed back to NIC dmfe0
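On a busy box these messages get buried, so a small filter can help pull out just what in.mpathd logged. This is a rough sketch, not something from my original test: mpathd_events is a hypothetical helper name, and I am assuming the messages land in /var/adm/messages, so pass a different file if your syslog is set up otherwise.

```shell
#!/bin/sh
# Sketch: print only the lines logged by in.mpathd from a syslog file.
# mpathd_events is a hypothetical helper name.
# Argument: log file to search (assumed default: /var/adm/messages).
mpathd_events() {
    logfile=${1:-/var/adm/messages}
    # Match the "in.mpathd[PID]" tag that syslog puts on each message.
    grep 'in\.mpathd\[' "$logfile"
}

# Demo against a few lines copied from the output above.
sample=/tmp/mpathd-sample.$$
cat > "$sample" <<'EOF'
Aug 16 17:54:28 ra dmfe: [ID 801593 kern.notice] NOTICE: dmfe0: PHY 1 link down
Aug 16 17:54:28 ra in.mpathd[135]: [ID 215189 daemon.error] The link has gone down on dmfe0
Aug 16 17:54:28 ra in.mpathd[135]: [ID 594170 daemon.error] NIC failure detected on dmfe0 of group failover
Aug 16 17:54:28 ra in.mpathd[135]: [ID 832587 daemon.error] Successfully failed over from NIC dmfe0 to NIC dmfe1
EOF
mpathd_events "$sample"
rm -f "$sample"
```

The demo prints the three in.mpathd lines and skips the dmfe driver notice. The pattern uses only basic grep syntax, so it should not depend on which grep is first in your PATH.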
You now have a redundant Solaris machine that can resist some layer one failures.