FAQ and Support¶
Support¶
For issues regarding OMF, please subscribe to our mailing list at https://lists.nicta.com.au/wws/subscribe/omf-user and do not hesitate to post your question with a detailed error description. Please include any log files, configuration files, SQL dumps or stack traces in your post.
Please also check our issue tracker (click "Issues" above) to see whether your problem has already been reported or fixed. If you are registered on this site, feel free to add a new issue.
OMF 5.2 Installation FAQ¶
How do I upgrade from a previous OMF installation?¶
In some cases the quickest way might be to start off with a fresh Ubuntu installation. If you do want to upgrade, roughly follow these steps:
- Upgrade your Ubuntu to the latest release (sudo do-release-upgrade)
- Remove any previous OMF installation. Delete the /opt directory if you have it. Run "dpkg -l | grep omf" to find out which old packages are installed.
- Back up your omf configuration files in /etc
- Install the latest OMF by following the installation guide above
- Do not just copy over the old config files, as their syntax might have changed.
- Carefully merge your old inventory into the new inventory schema. It might be easier to recreate the inventory using your SQL sample import file.
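The backup and package-check steps above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and it assumes that the OMF configuration entries in /etc have names starting with "omf" and contain no spaces. Adjust the pattern to whatever your installation actually uses.

```shell
#!/bin/sh
# backup_omf_config SRC_DIR DEST_DIR   (hypothetical helper)
# Archive every entry in SRC_DIR whose name starts with "omf" (an assumed
# naming pattern) into a timestamped tarball in DEST_DIR and print its path.
backup_omf_config() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    archive="$dest/omf-config-$(date +%Y%m%d-%H%M%S).tar.gz"
    # Collect matching entry names relative to SRC_DIR
    entries=$(cd "$src" && ls -d omf* 2>/dev/null)
    if [ -z "$entries" ]; then
        echo "no omf* entries found in $src" >&2
        return 1
    fi
    # $entries is left unquoted on purpose: one name per word
    (cd "$src" && tar -czf - $entries) > "$archive" || return 1
    echo "$archive"
}

# list_omf_packages   (hypothetical helper)
# Print the installed packages whose dpkg entry mentions "omf".
list_omf_packages() {
    dpkg -l | grep omf
}
```

For example, "backup_omf_config /etc /root/omf-backup" (run as root) would write the archive to /root/omf-backup, a directory name chosen here only for illustration.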
Where are the packages for the old OMF release?¶
We are no longer offering Debian packages of OMF 5.1 (1.3) in our repository. You can however build them yourself easily:
svn co http://svn.mytestbed.net/OMF/branches/release-1.3
and then run make in the gridService, nodeAgent and nodeHandler directories. The Debian packages will be created in the build directory.
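If you build the old release more than once, the three make runs can be wrapped in a small loop. The sketch below assumes that the three directories sit directly under the checkout directory (release-1.3 after the svn command above); the function name build_old_omf and the MAKE override are illustrative additions.

```shell
#!/bin/sh
# build_old_omf CHECKOUT_DIR   (hypothetical helper)
# Run make in each of the three component directories, stopping at the
# first failure. Assumes they are direct children of CHECKOUT_DIR.
build_old_omf() {
    root=$1
    for dir in gridService nodeAgent nodeHandler; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing directory: $root/$dir" >&2
            return 1
        fi
        echo "building $dir"
        # MAKE may be overridden by the caller, e.g. MAKE="make -j2"
        (cd "$root/$dir" && ${MAKE:-make}) || return 1
    done
}
```

Usage after the checkout: build_old_omf release-1.3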
"omf save" fails!¶
Sorry, you cannot save files as the superuser (root).
Do the nodes always have to boot from PXE first, even when an image is installed on the HDD?¶
Yes. If the nodes are supposed to boot from their internal disk, we instruct them to do so via PXE. If you disable PXE in the BIOS, OMF cannot save/load images anymore, since it has to boot from the network for this.
"omf load" fails and in the EC logfile I can see the message '"opening output file: No such device or address"'¶
Check your inventory, table testbeds, column frisbee_default_disk and put the correct hard drive device there (such as /dev/sda). You can telnet into the node in PXE mode and find the device name by running "fdisk -l".
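As a rough illustration, the inventory change can be generated as an SQL statement and then fed to your database client. The table and column names come from the answer above; everything else is an assumption, including the helper name disk_update_sql and the idea that all testbeds in the table share one disk device. The statement has no WHERE clause, so it updates every row; restrict it yourself if your inventory holds several testbeds.

```shell
#!/bin/sh
# disk_update_sql DEVICE   (hypothetical helper)
# Print an SQL statement that sets frisbee_default_disk in table testbeds.
# Note: no WHERE clause, so every row in the table is updated.
disk_update_sql() {
    device=$1
    # Accept only paths below /dev/ ...
    case $device in
        /dev/?*) ;;
        *) echo "expected a device path such as /dev/sda" >&2; return 1 ;;
    esac
    # ... made of characters that are harmless inside an SQL string literal
    case $device in
        *[!A-Za-z0-9/_-]*) echo "unexpected character in device path" >&2; return 1 ;;
    esac
    printf "UPDATE testbeds SET frisbee_default_disk = '%s';\n" "$device"
}
```

Assuming the inventory lives in a MySQL database, the output could be piped into the client, for example "disk_update_sql /dev/sda | mysql -u USER -p DATABASE", where USER and DATABASE are placeholders for your own setup.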
What's the password to log in to your PXE and baseline image?¶
Username is "root", password is "voyage" for the baseline image. The PXE image requires no password on telnet logon.
I get the error "node xyz is not enrolled in '_ALL_' yet."¶
This is not an error. You need to wait until the node signs in. If it doesn't, check if the RC on the node is running and if it can connect to the Openfire server.
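To test from the node whether the Openfire server is reachable at all, a plain TCP connection attempt is often enough. The sketch below assumes that bash and the coreutils "timeout" command are available on the node and that Openfire listens on 5222, the standard XMPP client port. The function name can_connect and the host name in the usage line are placeholders.

```shell
#!/bin/sh
# can_connect HOST PORT   (hypothetical helper)
# Succeed if a TCP connection to HOST:PORT can be opened within 5 seconds.
# Uses bash's /dev/tcp redirection, so bash must be installed.
can_connect() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

Usage: can_connect openfire.example.org 5222 && echo "Openfire port is reachable"

A successful connection only shows that the port is open; it does not prove that the RC can log in.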
Openfire is running, but the EC and RC cannot connect to it¶
Check the firewall settings on the Openfire machine. TCP ports 5222 and 5223 should be open. We've also had an issue with Java on IPv6-enabled machines: Openfire would bind only IPv6 sockets and no IPv4 sockets. Some JVMs apparently use IPv6 by default for socket binding. The solution here was to disable IPv6 in the Linux kernel by passing "ipv6.disable=1" on the kernel command line.
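After rebooting, you can verify that the kernel really received the parameter by looking at /proc/cmdline. The helper below is a small sketch; its name is made up, and the optional file argument exists only so that it can be tried against a sample file.

```shell
#!/bin/sh
# ipv6_disabled_on_cmdline [FILE]   (hypothetical helper)
# Succeed if "ipv6.disable=1" appears as a separate word in the kernel
# command line. FILE defaults to /proc/cmdline.
ipv6_disabled_on_cmdline() {
    file=${1:-/proc/cmdline}
    # One parameter per line, then require an exact whole-line match
    tr ' ' '\n' < "$file" | grep -qx 'ipv6\.disable=1'
}
```

Usage: ipv6_disabled_on_cmdline && echo "IPv6 is disabled at boot"

On Ubuntu systems that boot with GRUB 2, the parameter is typically added to GRUB_CMDLINE_LINUX in /etc/default/grub, followed by running update-grub; other boot loaders need a different procedure.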
If your testbed has 100+ nodes, Openfire might run into the open files limit. On most Linux systems, a process can by default open up to 1024 files, UNIX sockets or network sockets. If this limit is reached, Openfire cannot establish any new connections and the EC will hang while trying to connect. Openfire reports "Too many open files" in its error.log. The solution is to add the command "ulimit -Hn 10000" to the file /etc/init.d/openfire right before the start-stop-daemon line. This sets the hard limit of open files for the Openfire process to 10000.
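To see how close a running Openfire is to its limit, you can compare its number of open file descriptors with the limits the kernel reports for the process. This is a Linux-only sketch that reads /proc; the function names are illustrative, and inspecting another user's process usually requires root.

```shell
#!/bin/sh
# open_fd_count PID   (hypothetical helper)
# Print how many file descriptors the process currently holds.
open_fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# nofile_limits PID   (hypothetical helper)
# Print the soft and hard "Max open files" limits of the process.
nofile_limits() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "/proc/$1/limits"
}
```

Usage, assuming the Openfire process can be found by matching "openfire" on its command line: pid=$(pgrep -f openfire | head -n 1); open_fd_count "$pid"; nofile_limits "$pid"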
With more than 100 nodes we also recommend using MySQL as the database for Openfire, since we ran into some lockfile issues with Openfire's internal database engine. Using MySQL also makes it easier to debug Openfire problems, because you can browse the node and message tables that Openfire uses.
We've also experienced an issue where some RCs would hang when connecting to Openfire. A tcpdump shows that they do retry the connection periodically. Turning off SSL/TLS in Openfire (Server Settings - Security Settings - Client .. Custom, select N/A for both methods) resolved the problem.
If Openfire's logfile reports "out of memory", it might be worthwhile to set the memory limit of the Java VM explicitly. Add "-Xms512m -Xmx512m" to the DAEMON_OPTS in /etc/default/openfire.
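In /etc/default/openfire the change might look like the fragment below. It is a sketch that assumes the file is sourced by a shell and that any options already present in DAEMON_OPTS should be kept; check how the variable is defined in your copy of the file before editing.

```shell
# /etc/default/openfire (fragment)
# Append a fixed initial (-Xms) and maximum (-Xmx) Java heap size of 512 MB
# to whatever DAEMON_OPTS already contains.
DAEMON_OPTS="${DAEMON_OPTS:-} -Xms512m -Xmx512m"
```

Openfire has to be restarted for the new options to take effect.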
The RC complains that it cannot find 'ifconfig', 'iwconfig', 'wlanconfig', 'iwpriv' or some other tool.¶
Make sure the wireless tools are located in the default path '/sbin'. If you have them in a different path, please copy or symlink them to '/sbin'.
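The symlinks can be created in one go with a loop like the one below. It is a minimal sketch: the function name is made up, only the four tools named in the question are handled, and it links whatever "command -v" finds first on the current PATH. Run it as root when the target is /sbin.

```shell
#!/bin/sh
# link_wireless_tools [TARGET_DIR]   (hypothetical helper)
# For each tool, create a symlink in TARGET_DIR (default /sbin) pointing at
# the copy found on PATH. Entries already present in TARGET_DIR are kept.
link_wireless_tools() {
    target=${1:-/sbin}
    for tool in ifconfig iwconfig wlanconfig iwpriv; do
        if [ -e "$target/$tool" ]; then
            continue
        fi
        found=$(command -v "$tool" 2>/dev/null) || found=""
        case $found in
            /*) ln -s "$found" "$target/$tool" && echo "linked $tool -> $found" ;;
            *)  echo "not found on PATH: $tool" >&2 ;;
        esac
    done
}
```

Usage: link_wireless_tools /sbin

Tools reported as "not found on PATH" are not installed in any directory the shell searches, so they need to be installed or located by hand first.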
I can't get it to work! I swear I followed your install guide and double checked my configuration, but it still doesn't run!¶
Don't panic! We are happy to help. Drop us a line at christoph.dwertmann@nicta.com.au or thierry.rakotoarivelo@nicta.com.au. Don't forget to include your EC, RC and AM logfiles and config files as well as a SQL dump of your inventory.