The pkgsrc portability guide

pkgsrc is a package management system that has been ported to various POSIX-like operating systems. These systems differ in a lot of small details that are all nice to know when writing code that will later run on as many of these platforms as possible. This article collects some of the limitations, bugs and other characteristics of the pkgsrc platforms.

Introduction

Writing programs that are portable to a wide range of platforms isn't easy. Some platforms lack your favorite tool, while others implement it, but with arbitrary limitations. Some lack useful functions such as snprintf or strlcpy.

This guide documents which features are missing on the platforms, since the features that are available are already documented quite well. It also explains what portable shell programs look like, since that topic is not yet widely understood.

Since pkgsrc uses the POSIX tools quite a lot, and the packages use all the other features, it is good to know on which platforms a certain program will need extra work before it runs.

Utilities

This chapter collects the various bugs and peculiarities of the POSIX-like utilities on the platforms.

find

On IRIX 6.5, find does not know the -xdev option, which is used by some parts of the pkgsrc infrastructure, for example, print-PLIST.

install

NetBSD's install copies the file flags from the source files. Just in case you have protected your distfiles with the uchg flag, this means that the installed files will have the same flag if you install them directly using some code like ${INSTALL_*} ${DISTDIR}/foo.txt ${PREFIX}/share/doc/foo.txt.

ls

See also The Open Group's specification.

On older Mac OS X releases, ls does not return an error code when the file to be listed does not exist. This has been fixed since Mac OS X 10.5.
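Where a script only needs to know whether a file exists, one way to avoid depending on the exit status of ls is to ask test directly. The following is a minimal sketch; path_exists is a hypothetical helper name, and it uses -f and -d rather than -e because this guide lists -e as unsupported by the Solaris /bin/sh builtin.

```sh
# Hypothetical helper: succeed if the path is a regular file or a directory.
# Other file types (devices, FIFOs, sockets) are deliberately not covered.
path_exists() {
	[ -f "$1" ] || [ -d "$1" ]
}

if path_exists /etc; then
	echo "/etc exists"
fi
```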

sh

See also The Open Group's specification.

Non-standard extensions

  • The variable RANDOM is an extension of ksh and bash.
  • The [[ command, which is a reserved word, is an extension of ksh and bash.
  • The function keyword is an extension of ksh and bash. See the Functions section in the ksh(1) manpage for more information on how to convert functions declared with the function keyword to ones without (usually you can just remove the keyword and it works).
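As an illustration of the conversions described above, here is a small sketch. The function names greet and starts_with_foo are made up for this example; the case statement is one common portable stand-in for the pattern matching that [[ offers.

```sh
# ksh/bash only:   function greet { echo "hello $1"; }
# Portable form: drop the keyword and add the parentheses.
greet() {
	echo "hello $1"
}

# ksh/bash only:   if [[ $name == foo* ]]; then ...
# Portable form: use case for pattern matching.
starts_with_foo() {
	case $1 in
	foo*) return 0 ;;
	*) return 1 ;;
	esac
}

greet world
if starts_with_foo foobar; then
	echo "foobar starts with foo"
fi
```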

NetBSD

On NetBSD, have a look at the various PRs to see what's wrong with the shell.

Solaris

On Solaris, /bin/sh is missing some features that POSIX requires for the sh utility:

$ if ! false; then echo ok; fi
!: not found
$ echo ${PWD%/}
bad substitution
$ foo=$(true)
syntax error: `foo=$' unexpected

There is another sh implementation in /usr/xpg4/bin, which implements these features. However, many shell scripts and other programs have /bin/sh hard-coded.
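For scripts that must keep running under /bin/sh, the three constructs shown above can usually be rewritten in older syntax. This is only a sketch of possible replacements; the variable names dir, stripped and foo are chosen for the example.

```sh
# Instead of:  if ! false; then echo ok; fi
if false; then
	:
else
	echo ok
fi

# Instead of:  echo ${PWD%/}
# Strip one trailing slash with sed.
dir="/usr/pkg/"
stripped=`echo "$dir" | sed 's,/$,,'`
echo "$stripped"

# Instead of:  foo=$(true)
# Backquotes are the traditional form of command substitution.
foo=`true`
```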

Another incompatibility is this:

$ set -- foo bar baz; echo "before: $#"; set --; echo "after: $#"
before: 3
after: 3

All(?) other shells, including /usr/xpg4/bin/sh, reply with "after: 0".
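A traditional idiom for emptying the positional parameters without relying on a bare set -- is to set a single dummy parameter and shift it away. The sketch below shows the idea; it has not been verified against the Solaris /bin/sh for this guide, so treat that part as an assumption.

```sh
set -- foo bar baz
echo "before: $#"

# Replace the parameters with one dummy word, then remove it.
set x
shift

echo "after: $#"
```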

On Solaris, both /bin/sh and /usr/xpg4/bin/sh cannot handle empty for loops:

$ for i in ; do echo "i=$i"; done
syntax error: `;' unexpected

The workaround is either to add some dummy arguments or to save the list of things in a variable. Looping over a variable that expands to an empty list is no problem. But note that the parameter expansions applied to the list differ between the two forms.

$ things=""; for i in $things; do echo "i=$i"; done

Globbing problems with quoting and hidden files

rm -rf "testdir"
mkdir "testdir"
touch "testdir/.file"
x="."
y=".file"
echo "=== x and y quoted"
ls "$x"/*/"$y"
echo "=== only x quoted"
ls "$x"/*/$y
echo "=== nothing quoted"
ls $x/*/$y

When this code is run with /bin/sh, the globbing expansion in the first case yields ./*/.file, while in the other two cases it yields the correct "./testdir/.file". This only happens when y starts with a dot, that is, for hidden files.

cd /nonexistent

One thing that the Solaris /bin/sh is absolutely unable to do is proper error handling in the "set -e" mode, especially for builtin commands.

The following command should work and print an error message, followed by "Does not exist.".

$ sh -c 'cd /non || echo "Does not exist."'

Let's see what happens:

$ uname -sr
NetBSD 4.99.4
$ /bin/sh -c 'cd /non || echo "Does not exist."'
cd: can't cd to /non
Does not exist.

$ bash -c 'cd /non || echo "Does not exist."'
bash: line 0: cd: /non: No such file or directory
Does not exist.
$ uname -sr
IRIX64 6.5
$ /bin/sh -c 'cd /non || echo "Does not exist."'
/bin/sh: /non:  not found
Does not exist.

$ /bin/sh -c 'cd /non || echo "Does not exist."'
/bin/sh: /non:  not found
Does not exist.
$ uname -sr
SunOS 5.10

$ /bin/ksh -c 'cd /non || echo "Does not exist."'
/bin/ksh: /non:  not found
Does not exist.

$ /bin/sh -c 'cd /non || echo "Does not exist."'
/bin/sh: /non: does not exist

So all shells except the Solaris /bin/sh got it right.

Oops, I just noticed that I had left out the "-e" option, which I really wanted to test. That omission makes the result even worse: the Solaris /bin/sh quits, although it wasn't told to do so. By the way, the other shells' output doesn't change when adding the "-e" option.
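Since the failing cd itself is what makes the shell quit, one defensive approach is to check for the directory before trying to enter it. The following is a sketch only: enter_dir is a hypothetical helper, and the assumption that this avoids the premature exit on Solaris has not been verified here. It also does not cover directories that exist but cannot be entered.

```sh
# Hypothetical helper: change into a directory only after checking
# that it exists, so that cd is never run on a missing directory.
enter_dir() {
	if [ -d "$1" ]; then
		cd "$1"
	else
		echo "Does not exist." >&2
		return 1
	fi
}

enter_dir /non || echo "continuing after the failed check"
```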

"set -e" means automatic error checking

The other way round is equally bad: When the builtin "[" command fails, the shell does not exit.

On every system except Solaris ...

$ /bin/sh -e -c '[ "" ]; echo "This should never be seen."'

But on Solaris:

$ sh -e -c '[ "" ]; echo "This should never be seen."'
-c: -c: cannot open
$ /bin/sh -ec '[ "" ]; echo "This should never be seen."'
This should never be seen.

Ouch. That can't be true: /bin/sh can only handle a single option argument. And even when the two options are combined into one argument, the shell still does not exit.

But there are two pieces of good news:

  • first, simple assertions (like above, but without the following echo) work.
  • second, the POSIX standard does not require that /bin/sh be a conforming shell, it just requires that some conforming shell can be called as sh. That shell is in /usr/xpg4/bin.

Too many subshells

vars="a"
for i in 1; do
	read a || die
	echo "inside: a=$a"
done <<EOF
value for a
EOF

echo "outside: a=$a"

The resulting output is:

inside: a=value for a
outside: a=

Believe it or not, the Solaris /bin/sh executes the for loop in a subshell, just because of the input redirection. This behavior is not documented anywhere.
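One way that may sidestep the problem is to keep the redirection off the loop itself: open the input on a separate file descriptor with exec and let read use that descriptor. This is a sketch under assumptions. The temporary file name and the choice of descriptor 3 are arbitrary, and whether the Solaris /bin/sh really keeps the loop in the main shell this way has not been verified here.

```sh
tmpfile=/tmp/portability-demo.$$
echo "value for a" > "$tmpfile"

# Open the input once, outside the loop, on file descriptor 3.
exec 3< "$tmpfile"
for i in 1; do
	read a <&3 || exit 1
	echo "inside: a=$a"
done
# Close the descriptor and clean up.
exec 3<&-
rm -f "$tmpfile"

echo "outside: a=$a"
```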

Bug when reading null bytes

Don't try to read line < binaryfile in either /bin/ksh or /usr/xpg4/bin/sh on Solaris 9: these shells will hang.

Sun knows this, but they do not want the world to know the details.

Variable assignment isn't trivial

$ for a in /bin/sh; do foo=bar break; done; echo $foo

The result should be that foo is still undefined.

  • Shells that get it wrong are: Solaris /bin/sh, ksh, NetBSD /bin/sh, IRIX sh and ksh and Mac OS ksh.
  • Shells that get it right are: DragonFly /bin/sh, zsh.

shift

According to the specification, the shift utility is a "special builtin command", which means that when it "can't shift that many", the shell MUST exit, unless it's an interactive shell. See PR 37493.
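Because a failing shift may terminate a non-interactive shell, it is safer to compare $# with the shift count first. A minimal sketch, with skip_two as a hypothetical helper name:

```sh
# usage: skip_two arg...
# Prints the arguments after the first two, or fails without ever
# calling shift when fewer than two arguments were given.
skip_two() {
	if [ $# -lt 2 ]; then
		echo "skip_two: need at least 2 arguments, got $#" >&2
		return 1
	fi
	shift 2
	echo "$*"
}

skip_two a b c d
```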

test

See also The Open Group's specification, which contains many useful and instructive examples in the Application Usage section.

When using test or its near relative, [, pay attention to what the arguments can be. If any of them might start with a hyphen or be an exclamation mark (!), it is better to write the test like this:

test ":$1" != ":-foo"
test "x$1" != x"-foo"

On the other hand, if you can guarantee that the argument is always a "safe" string, you can use one of the following forms:

test "$1" != "foo"
test $1 != foo

Otherwise, some implementations of this utility might fail, and believe me, there are many different implementations flying around in the world.

Supported options

It is expected that the test utility fully supports the POSIX standard. Any deviations from it are recorded in the “Unsupported” column.

Supported options of the test utility
Platform                Unsupported  Supported           Extensions
AIX-4.3-*               S            bcdefghLnpr-stuwxz  k
IRIX-6.5-*              S            bcdefghLnpr-stuwxz  k
Linux-2.6.*-*           (none)       bcdefghLnprSstuwxz  GkO nt ot ef
NetBSD-3.0-*            (none)       bcdefghLnprSstuwxz  GkO nt ot ef
SunOS-5.10-* /bin/sh    eS           bcd-fghLnpr-stuwxz  k
SunOS-5.10-* /bin/test  (none)       bcdefghLnprSstuwxz  aGkOo nt ot

Remarks:

  • On SunOS-5.10-*, the behavior of the -f operator depends on the value of the environment variable PATH.
  • On SunOS-5.10-*, /usr/ucb/test behaves like the builtin command in /bin/sh. On the other hand, /usr/bin/test conforms to POSIX.

Other non-standard extensions

The Bourne Again Shell (bash) and some implementations of ksh allow the binary operator == to be used as an alias for =. This leads to problems when programs using this feature are ported to other platforms. By the way, not even the GNU coreutils have that "feature". After all, there is no apparent benefit in having two names for the same operator.
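The portable spelling simply uses a single equals sign. A short sketch, with is_yes as a hypothetical helper name:

```sh
# Non-portable (bash, some ksh):  [ "$1" == "yes" ]
# Portable: a single = is the string-equality operator of test.
is_yes() {
	[ "$1" = "yes" ]
}

if is_yes "yes"; then
	echo "confirmed"
fi
```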

Bugs

On NetBSD up to X.Y (see PR 34646), test cannot handle the following:

test ! = foo

POSIX requires that when test is called with three arguments, the second argument is checked first to see whether it is a binary operator. On NetBSD, the unary "!" operator takes precedence.

Compilers

  • MIPSpro does not take the #error directive as being an error, but only a warning. That is, the exit status is still zero, meaning success. To fix this, you have to pass it the -diag_error 1035 option. By now, this is done automatically by the pkgsrc infrastructure (mk/compiler/mipspro.mk).

Compiler feature matrix
Feature          ccc  gcc3  gcc4  icc  mipspro  sunpro  xlc
__func__ in C++  ?    yes   yes   ?    no       no      ?

GNU extensions to ISO C99

The GNU compilers implement many extensions to ISO C99. Many programmers are not even aware that there is a standard for the C programming language, so they use whatever GCC offers. That makes it difficult to port their code to other platforms.

varargs macros

The GNU compilers accept the following definition of a varargs macro:

#define my_printf(fmt, args...) printf(fmt, args)

ISO C99 defines only a different form of varargs macro, which GCC also implements:

#define my_printf(fmt, ...) printf(fmt, __VA_ARGS__)

Additionally, ISO C99 says that there shall be more arguments in the invocation than there are parameters in the macro definition (excluding the ...) (ISO C99, 6.10.3#4). That is, the above my_printf macro must be called with at least two arguments. The best definition is therefore:

#define my_printf(...) printf(__VA_ARGS__)

Block expressions ({...})

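The GNU compilers accept a brace-enclosed block inside parentheses as an expression, which is not part of ISO C99. The closest portable replacement is to convert the block expressions to inline functions.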

Device files

This chapter lists the device files that are available on the platforms and whether they are compatible with the corresponding devices on other platforms.

stdin, stdout, stderr

These device files, although they are found on many systems, are not required by POSIX. They also do not exist on IRIX.
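In shell scripts, these device files can usually be avoided by using file descriptor redirections and by letting filters read their standard input. A small sketch, where warn and sort_stdin are hypothetical helper names:

```sh
# Instead of:  echo "warning: ..." > /dev/stderr
warn() {
	echo "warning: $*" >&2
}

# Instead of:  sort /dev/stdin
# Like most filters, sort reads standard input when given no file operand.
sort_stdin() {
	sort
}

warn "disk almost full"
printf 'b\na\n' | sort_stdin
```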

C/C++: System-defined macros

EX_OK

On IRIX 6.5, <unistd.h> defines EX_OK (value 020) as a constant for the second parameter of access(2), to test whether a file is executable. On the same system, in the file <sysexits.h>, the same macro has a different value (0) and purpose; here it is an alias for EXIT_SUCCESS.

C/C++: Headers

Sometimes, the order of inclusion matters

  • On IRIX 6.5, <stdint.h> should be included before <wchar.h>, because there are conflicting definitions for WCHAR_MIN and some other macros, and only <wchar.h> has the necessary protection against multiple definitions.

C/C++: Functions

This chapter contains the requirements for using certain functions in C and C++ code.

Feature test macros

When writing code that conforms to a specific standard, some programmers may want to check that really only features from that standard are used. For that purpose there exist a number of so-called feature test macros that can be defined on the command line to the compiler or in C and C++ source files before including any of the standard include files.

Documentation of the feature test macros
Platform     Location
SunOS-*-*    standards(5)
NetBSD-*-*   /usr/include/sys/featuretest.h
Linux-*-*    /usr/include/features.h

The feature test macros include, but are not limited to:

Common feature test macros
Macro            Explanation
_POSIX_C_SOURCE  The Open Group's web site
_XOPEN_SOURCE    The Open Group's web site
_GNU_SOURCE      The GNU libc
_NETBSD_SOURCE   NetBSD's featuretest.h

See also Feature test macros

Functions on various platforms

asprintf

Solaris 10 does not have this function, but it has snprintf, which can serve as a replacement as long as the caller allocates the buffer itself.

cfmakeraw

Solaris doesn't have this function, but you can just copy-and-paste the code from NetBSD.

cfsetspeed

Solaris doesn't have this function, but you can just copy-and-paste the code from NetBSD.

getopt

Solaris declares the getopt function in three of the standard headers: <stdio.h>, <stdlib.h> and <unistd.h>, but the declaration in the last one depends on different feature test macros than the first two. Additionally, since Solaris 10 (SunOS 5.10), there is also <getopt.h>, which provides getopt_long, but not getopt. The exact behavior also depends on the compiler. While SUNpro is quite strict and requires the feature test macros, gcc finds the declaration without any of these macros. (TODO: Investigate further.)

Platform      Library  Headers   Feature test macros
NetBSD-*-*    c        unistd.h  _POSIX_C_SOURCE >= 2 || _XOPEN_SOURCE >= 4 || defined(_NETBSD_SOURCE)
SunOS-5.10-*  c        unistd.h  (_XOPEN_SOURCE && _XOPEN_VERSION == 4) || __EXTENSIONS__

inet_aton

Needs -lresolv on Solaris.

nanosleep

On Solaris, the rt library is needed.

statfs and statvfs

The statfs function is not standardized by The Open Group, and several different implementations exist. It has been deprecated on many systems in favor of statvfs.

  • IRIX: int statfs (const char *path, struct statfs *buf, int len, int fstyp);
  • Linux 2.6: int statfs(const char *path, struct statfs *buf);
  • NetBSD 3: not available; int statvfs(const char *path, struct statvfs *buf); is available, as are other statvfs variants
  • Solaris 10: int statfs (const char *, struct statfs *, int, int); (deprecated)

Missing functions

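Function name  Platforms     Remarks
truncf         NetBSD-3.0-*  You can use rintf() instead.
trunc          NetBSD-3.0-*  You can use rint() instead.

Note that rint() and rintf() round according to the current rounding mode (to nearest by default), whereas trunc() and truncf() round toward zero, so the results differ for some inputs.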

C/C++: Reserved names

There are many variable names that should not be used because they are defined as macros in some system headers of certain operating systems.

Name      Appearance
lines     Solaris <term.h>
st_atime  NetBSD <sys/stat.h>
st_ctime  NetBSD <sys/stat.h>
st_mtime  NetBSD <sys/stat.h>

TODO: Build a database of all macros that are defined in all header files by all operating systems and pkgsrc packages.
