Solaris pbulk problems HowTo

From NetBSD Wiki

General

pbulk is a bulk build framework for NetBSD pkgsrc. It was developed by Joerg Sonnenberger as part of a Google Summer of Code project in 2007. The tools themselves are written in C; the individual parts are driven by shell script wrappers. The name pbulk stands for parallel bulk framework.

The general concept behind pbulk is to use the existing pkgsrc toolchain and to extend or replace certain phases. The full framework for running bulk builds consists of two separate toolchains. The outer toolchain, referred to as the pbulk toolchain, controls the build and takes care of dependencies, build order and report creation. The inner toolchain, referred to as the pkgsrc toolchain, is the one that actually builds the packages and adds and removes the dependent packages required for each build.

The main advantage of pbulk is that bulk builds can be run in parallel using a client/server architecture.

Warning: Like the old bulk build framework located at pkgsrc/mk/bulk, pbulk will remove all installed packages from the system. This is not a design goal in itself; it ensures that all packages can be built in the correct order.

The section "Bulk building with pbulk in a sandbox" in this document describes how to deal with this situation.

Note: Check your environment before you start using pbulk! All the packages will fail to build if you have a PKG_PATH set in your environment. Make sure to unset it.

  unset PKG_PATH
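
A small pre-flight check along the following lines can catch a stray PKG_PATH before a build is started. This is only a sketch; the function name check_pkg_path is hypothetical and not part of pbulk.

```shell
# Hypothetical helper, not part of pbulk: refuse to continue while
# PKG_PATH is set in the environment (even if it is set to an empty value).
check_pkg_path() {
    if [ -n "${PKG_PATH+set}" ]; then
        echo "PKG_PATH is set to '${PKG_PATH}'; unset it before running pbulk" >&2
        return 1
    fi
    return 0
}
```

It could be called at the top of whatever script starts the bulk build, for example: check_pkg_path || exit 1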

Note: The setup on non-NetBSD platforms may differ from what is described here. Make sure to read the section "Issues/Changes on non-NetBSD platforms" for further information.


Set up the pbulk toolchain (outer toolchain)

This part of the toolchain controls the build, takes care of the build order and generates the report. To keep a consistent layout, the following structure is recommended:

      /usr
      |
      +- pkg_bulk                root of the pbulk toolchain
      |    |
      |    +- bulklog            place to create the reports
      |
      |
      +- pkgsrc                  root of pkgsrc toolchain
         |
         +- packages             place for the packages
         |
         +- distfiles            place for the distfiles
      

The first step is to bootstrap pkgsrc in order to set up the outer toolchain.

     cd /usr/pkgsrc/bootstrap && \
        ./bootstrap --workdir=/tmp/pbulk \
               --prefix=/usr/pkg_bulk \
               --pkgdbdir=/usr/pkg_bulk/.pkgdb


After bootstrapping is done, the work directory /tmp/pbulk can be removed safely.

      rm -rf /tmp/pbulk

The next step is to build pbulk itself. Make sure to use the toolchain which has just been built in /usr/pkg_bulk.

      cd /usr/pkgsrc/pkgtools/pbulk && \
         /usr/pkg_bulk/bin/bmake install clean clean-depends


If other tools such as misc/screen are needed, they can be built like the pbulk binaries above. The packages will be installed inside the pbulk outer toolchain. This way they are kept when pbulk starts, because pbulk uses the inner toolchain for building. For screen it would look like this:

      cd /usr/pkgsrc/misc/screen && \
         /usr/pkg_bulk/bin/bmake install clean clean-depends


If you would like to send bulk reports to pkgsrc-bulk@NetBSD.org, make sure there is a working SMTP server on the build host. If only a small SMTP send daemon with SMTP AUTH support is needed, the mail/esmtp package may be a big help. However, it pulls in bison, m4, gmake and other things.


Configure the outer toolchain

First, some configuration work on mk.conf needs to be done. Remember: the correct file is /usr/pkg/etc/mk.conf as we are working in the outer toolchain right now. The fragment shown below needs to be added; passing it to bootstrap with --mk-fragment inserts it into the generated mk.conf:

   cd /usr/pkgsrc/bootstrap && \
       ./bootstrap --workdir=/tmp/bootstrap \
       --gzip-binary-kit=/usr/pkgsrc/bootstrap/bootstrap.tar.gz \
       --mk-fragment=/root/mk-fragment.conf
   rm -rf /tmp/bootstrap


mk-fragment.conf:

     ---------------------------------------------------
     FAILOVER_FETCH=           yes
     X11_TYPE=                 modular
     _ACCEPTABLE=              yes
     ALLOW_VULNERABLE_PACKAGES=yes
     PKG_DEVELOPER?=           yes
     WRKOBJDIR=                /tmp/pbulk
     ---------------------------------------------------

Remark: WRKOBJDIR is the directory where all the build work happens. Tarballs are extracted here and buildlink runs against this filesystem, so it is heavily I/O-bound. If you can spare enough memory, a tmpfs memory filesystem will speed up your build drastically. This may cause problems with packages like openoffice2, which need several gigabytes of space for the build.

Remark: It's OK to put DISTDIR and PACKAGES on an NFS shared filesystem. This is required if you are running the builds on multiple nodes.

Note: A separate WRKOBJDIR per host is needed in parallel build environments, as otherwise several hosts might try to access the same work directories.

Remark: Never run the bulkbuild or scan scripts without first setting up the /usr/pkg/etc/mk.conf file.


The next file which needs configuration work is the pbulk configuration file, pbulk.conf. This file is a little bit lengthy, so only the settings that matter for a minimal setup are listed here.

for all systems:

   - master_mode=no
   - base_url=<url to the report>

for NetBSD systems:

   - pkg_install_prefix=/usr
   - bootstrapkit=
   - make=/usr/bin/make

for non-NetBSD systems:

   - pkg_install_prefix=/usr/pkg_bulk
   - bootstrapkit=
   - make=/usr/pkg/bin/bmake
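
Put together, the settings listed above for a non-NetBSD system might look like the following fragment. This is only a sketch: the report URL is a placeholder, and the bootstrap kit path is an assumption taken from the bootstrap command shown earlier.

```shell
# Sketch of the pbulk.conf settings discussed above, for a non-NetBSD host.
# Single-host build, no master/client setup.
master_mode=no
# Placeholder URL; replace with the location where your reports are published.
base_url=http://bulk.example.org/reports
pkg_install_prefix=/usr/pkg_bulk
# Assumed location of the gzipped bootstrap kit; adjust to where yours lives.
bootstrapkit=/usr/pkgsrc/bootstrap/bootstrap.tar.gz
make=/usr/pkg/bin/bmake
```

On NetBSD the last three lines would instead use the values from the NetBSD list above, with bootstrapkit left empty.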


If you want to build only a subset of packages, the variable limited_list may be helpful. The limited_list parameter names a file with one package per line. The location of each package is given in the usual category/package notation, e.g. meta-pkgs/gnome-base.
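
As an illustration, such a list file could be generated with a small helper like the one below. The helper name write_limited_list and the file location used in the comments are assumptions; only the one-package-per-line format comes from the description above.

```shell
# Hypothetical helper: write one package location per line to a list file.
# Usage: write_limited_list FILE CATEGORY/PACKAGE...
write_limited_list() {
    list_file=$1
    shift
    printf '%s\n' "$@" > "$list_file"
}

# Example call (the file location is an assumption):
#   write_limited_list /usr/pkg_bulk/etc/pbulk.list misc/screen pkgtools/pbulk
# and the matching line in pbulk.conf:
#   limited_list=/usr/pkg_bulk/etc/pbulk.list
```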

Starting the pbulk

The build runs through several stages. The first stage prepares the build: it cleans up and removes existing packages.

In the next stage pbulk scans the pkgsrc tree for Makefiles and builds a dependency tree from them. On a slow machine like a Sun Ultra 10 this may take up to 12 hours. On a mid-range Pentium 4 machine it may be done in about 2 hours.

The next stage is the build stage. The packages are built inside the inner toolchain. After a successful build, a package is placed inside /usr/pkgsrc/packages and its name is added to the bulklog/meta/success file. If a package fails, its log files are saved inside the bulklog directory for further investigation and its name is added to the bulklog/meta/errors file.

In the last two stages no further compiling happens. The report stage creates a bulk report, and the upload stage uploads all the packages which are allowed to be distributed.

Because successes and failures are recorded this way, pbulk knows which packages still need to be built when a bulk build has been interrupted. The all-in-one shell script that runs all stages is /usr/pkg_bulk/bin/bulkbuild

The different stages of the bulk build can be started manually by calling the shell script wrappers:


 STAGE       |   SHELLSCRIPT-WRAPPER
 ------------+-----------------------------------------
 pre-build   |    /usr/pkg_bulk/libexec/pbulk/pre-build
 scan        |    /usr/pkg_bulk/libexec/pbulk/scan
 build       |    /usr/pkg_bulk/libexec/pbulk/build
 report      |    /usr/pkg_bulk/libexec/pbulk/report
 upload      |    /usr/pkg_bulk/libexec/pbulk/upload
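
When running the stages by hand, a small wrapper can make sure they run in order and stop at the first failure. The following is only a sketch; run_stages and PBULK_LIBEXEC are hypothetical names, and the only thing taken from pbulk itself is the wrapper script paths in the table above.

```shell
# Directory holding the wrapper scripts; override it if your prefix differs.
PBULK_LIBEXEC=${PBULK_LIBEXEC:-/usr/pkg_bulk/libexec/pbulk}

# Hypothetical helper: run the named stages in order and stop at the
# first stage whose wrapper script exits with a non-zero status.
run_stages() {
    for stage in "$@"; do
        echo "==> ${stage}"
        if ! "${PBULK_LIBEXEC}/${stage}"; then
            echo "stage ${stage} failed" >&2
            return 1
        fi
    done
}

# Full run:               run_stages pre-build scan build report upload
# Later stages only:      run_stages build report upload
```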

Bulk building with pbulk in a sandbox


If you don't want to render your system useless through the removal of all packages, you can build everything inside a sandbox. The script mksandbox helps you to set up a sandbox. It is located in mk/bulk inside your pkgsrc directory. Here is how it works:

       mkdir /usr/sandbox
       cd /usr/pkgsrc/mk/bulk
       ./mksandbox /usr/sandbox

A sandbox will be created for you. Basically, a sandbox is a small NetBSD root with device nodes and a kernel. The script mksandbox sets up a lot of null mounts to build such a system. After you have created a sandbox you can chroot into it with:

       chroot /usr/sandbox /bin/ksh

Just build the toolchain as described above. You can exit the sandbox by typing the command "exit".


Parallel bulk builds

In order to use pbulk for parallel bulk builds you need to define a master server and several client nodes which will assist the build. This is done in the pbulk configuration file by setting master_mode to "yes". master_ip should contain a single IP address for the master server. scan_clients and build_clients can each contain a list of servers. The list is enclosed in quotes and the IP addresses are separated by whitespace. The scan clients help the master by resolving dependencies and sending the data back to master_ip. The master itself assembles a list with the correct build order out of it. Make sure to set up two ports for the communication between the master and the clients.

In order to start a parallel build a few things need to be done:

   - Set up a pbulk outer toolchain and configure it.
   - Build a bootstrap package with a modified mk.conf, then tar and gzip it.
   - Copy the outer toolchain (/usr/pkg_bulk) to the clients.

In order to avoid duplicating the distfiles, an NFS mount is very useful. It is a must that every client has access to the packages which have just been built.
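
The master and client settings described above might look like the following pbulk.conf fragment. This is a sketch under stated assumptions: all addresses are documentation placeholders, and the names and numbers used for the two port settings (master_port_scan, master_port_build) are assumptions that should be checked against the pbulk.conf shipped with your pbulk version.

```shell
# Sketch of the parallel-build settings in pbulk.conf.
# All addresses are documentation placeholders from 192.0.2.0/24.
master_mode=yes
master_ip=192.0.2.10
# Quoted, whitespace-separated lists of client addresses.
scan_clients="192.0.2.21 192.0.2.22"
build_clients="192.0.2.21 192.0.2.22"
# ASSUMPTION: variable names and port numbers for the two communication
# ports; verify them in the pbulk.conf installed on your system.
master_port_scan=${master_ip}:2001
master_port_build=${master_ip}:2002
```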

It's a good idea to share the full /usr/pkgsrc from the master server and mount it on the clients. However, make sure that the WRKOBJDIR is on the local machine.

In order to start a parallel bulk build run /usr/pkg_bulk/bin/bulkbuild on the master server and start the clients with /usr/pkg_bulk/libexec/pbulk/scan-client-start.

Issues/Changes on non-NetBSD platforms

On non-NetBSD platforms things are a little different. The main difference is that they are pkgsrc bootstrapping platforms: they have to bootstrap pkgsrc in order to use it.

First, build the outer toolchain as described in this HOWTO so far. In the second step, build a bootstrap file, gzip it and put its name in the variable bootstrapkit= inside the pbulk.conf configuration file.


Helping the community

Please post the results to pkgsrc-bulk@NetBSD.org and put the report files on a web server. This gives the maintainers and developers of the packages the chance to view your build errors and improve the pkgsrc system.


Frequently asked questions

Q: How do I resume a bulk build?
A: Just run the appropriate wrapper script from /usr/pkg_bulk/libexec/pbulk.

Q: The packages have been scanned but there is no building at all. What's wrong?
A: Not all the dependencies have been resolved successfully. See the presolve.log log file in /usr/pkg_bulk/bulklog/meta/. Look for the string 'No match', resolve the problems and restart the bulk build.
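
The search can be done with grep. The helper below is only a sketch; the function name show_unresolved is hypothetical, and the default path is the one given in the answer above.

```shell
# Hypothetical helper: print the lines of presolve.log that contain
# 'No match'. Pass a different log file as the first argument if your
# bulklog directory lives elsewhere.
show_unresolved() {
    grep 'No match' "${1:-/usr/pkg_bulk/bulklog/meta/presolve.log}"
}
```

grep exits with a non-zero status when no line matches, so the helper can also be used in a condition to test whether anything is unresolved.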

Q: The build of pkgtools/pbulk fails with "nroff: Cannot find library -mandoc". What is wrong here?
A: Most likely you are running pkgsrc on a non-NetBSD system, such as Linux. You can solve the problem by installing the groff package and calling groff instead of nroff.

Q: I am running pbulk on Solaris and it tries to build thousands of dependencies when I try to build pbulk.
A: That's a nasty problem. The pbulk package lists groff and nroff among the tools it uses, so pkgsrc tries to build them in order to satisfy all dependencies. Usually the build breaks somewhere inside postscript. Apply the following patch to the pbulk Makefile and export the environment variable NROFF. You won't be able to read any pbulk man pages, but the framework compiles just fine.


 --- Makefile	14 Jul 2008 13:02:00 -0000	1.43
 +++ Makefile	30 Jul 2008 09:18:13 -0000
 @@ -19,7 +19,7 @@
  
  USE_FEATURES=	nbcompat
  USE_TOOLS+=	awk:run bzip2:run digest:run gzip:run ident:run make:run \
 -		mail:run sed:run tar:run groff nroff
 +		mail:run sed:run tar:run
  DEPENDS+=	rsync-[0-9]*:../../net/rsync
  
  .include "../../mk/bsd.prefs.mk"

Don't forget to export the NROFF variable:

 export NROFF=echo

If you forget to export the NROFF variable, the following error will occur:

 gcc -O -I/usr/pkg_bulk/include  -Wall -Wstrict-prototypes -Wmissing-prototypes -Wno-uninitialized -Wreturn-type -Wpointer-arith -Wcast-qual
 -Wwrite-strings -Wswitch -Wshadow -Werror  -DHAVE_NBCOMPAT_H=1 -I/usr/pkgsrc/pkgtools/pbulk/work/libnbcompat -I/usr/pkg_bulk/include
 -I/usr/pkgsrc/pkgtools/pbulk/work/pbulk/presolve/../lib -c presolve.c
 gcc -L/usr/pkgsrc/pkgtools/pbulk/work/libnbcompat -L/usr/pkg_bulk/lib -Wl,-R/usr/pkg_bulk/lib  -o pbulk-resolve presolve.o -lsocket -lresolv
 -L/usr/pkgsrc/pkgtools/pbulk/work/pbulk/presolve/../lib -lpbulk -lnbcompat
 nroff -Tascii -mandoc pbulk-resolve.1 > pbulk-resolve.cat1
 nroff: Cannot find library -mandoc
 
 *** Error code 1