openbsd-ports/infrastructure/mk
thfr 9dd09edd61 use comma in the substitution instead of '/', allows using gitlab
subgroups as raised by lraab@

ok espie@
2023-09-24 18:57:08 +00:00
..
README.internals add some significant words wrt places that must be kept in synch with *mk 2023-08-10 18:50:58 +00:00
apache-module.port.mk
arch-defines.mk bump _SYSTEM_VERSION-amd64 so that after the next snapshot build, pkg_add -u 2023-06-07 13:00:33 +00:00
automake.dep
bsd.port.arch.mk
bsd.port.mk only give a value to ALL_$v and _PATH_$v if they're not defined at all 2023-09-22 07:04:41 +00:00
bsd.port.subdir.mk
compiler.port.mk
cpan.port.mk Rather than using $cpan_mirror/modules/by-module/../by-authors/, use 2023-07-19 11:17:11 +00:00
dist-tuple.port.mk use comma in the substitution instead of '/', allows using gitlab 2023-09-24 18:57:08 +00:00
font.port.mk Eliminate redundant variables FONT_DISTDIR and FONT_DISTSUBDIR. 2023-09-14 03:51:43 +00:00
fortran.port.mk
gcc4.port.mk
gnu.port.mk
imake.port.mk
java.port.mk
modules.port.mk gc last remnants of DIST_TUPLE_MV 2023-09-06 09:51:34 +00:00
perl.port.mk
pkgpath.mk

README.internals

$OpenBSD: README.internals,v 1.19 2023/08/10 18:50:58 espie Exp $
Copyright (C) 2011-2012 Marc Espie <espie@openbsd.org>

 Permission to use, copy, modify, and distribute this software for any
 purpose with or without fee is hereby granted, provided that the above
 copyright notice and this permission notice appear in all copies.

 THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


This file is *not* user documentation of the *mk files. The full user
documentation is available as manpages such as bsd.port.mk(5)

There are comments in bsd.port.mk but this file is already too long,
so I finally decided to put the design notes in a separate file.

Most of this is not for the faint of heart.

"Imagination could conceive almost anything in connexion with this place."
(H.P. Lovecraft, At the Mountains of Madness)

IMPORTANT
---------
There are connections to other pieces outside of this directory.
The "infrastructure" consists of
- those pieces of mk code
- some internals of OpenBSD make
- users of introspection output, most specifically dpb, sqlports AND
pkglocatedb
- the man pages to this infrastructure, located under share/man[57]

Any change that impacts *mk *MUST* also take these into account
- new variables must be represented in sqlports
- dpb reproduces a significant amount of behavior of *mk internally (all the
fetching/checksumming stuff among others), so checking that dpb is still
happy after significant changes is mandatory.
- any new mechanism must be documented, preferably sooner rather than later.
Documentation is a very good litmus test: if you can't explain it simply,
chances are your new feature is NOT FINISHED.


Some design principles
----------------------
- all variables and targets prefixed with _  are for internal use.
It may be that some other tools reuse them, but that's generally following
the same guidelines: no user serviceable parts inside (and you should always
ask ME about it, since I'm liable to change these with very little notice).
This is an attempt to avoid exposing too many knobs, and a clear distinction
between "supported" stuff, and "you're on your own, if you abuse this, bad
things will happen".

- most variables will always be defined, even with an empty value.  This avoids
stupid .if defined(TESTS).

- bsd.port.mk has a strict separation between variable definitions (at top)
and target definitions (right after the
	    ###
	    ### end of variable setup. Only targets now
	    ###
comment). This is because of make's "lazy" way to evaluate things:
constructs in shell-lines get evaluated very very late. But all
targets (such as ${_WRKDIR_COOKIE}) get evaluated when they're
encountered. There was some significant effort in the early days
when I took bsd.port.mk over to achieve that separation.  Be very
careful with Makefile.inc and modules, since those get included at
a very specific location in the Makefile.  They should mostly define
variables. Makefile.inc is included very early so that it can
override about anything it should be able to (but then, nothing is
already defined except for read-only variables / global user-tweakable
defaults). Modules is included a little bit later...

- a lot of the code is done through shell fragments. All of these are
internal and prefixed with _.  This is done for speed and compactness (forking
an external shell would be very slow, and a total nightmare with respect
to parameter passing).  These often communicate through internal shell
variables with well-defined names.

- the infrastructure tries very hard to have real files (as far as possible)
attached to most targets: "make package" points to the actual pkgfile(s)
being generated. There's a whole section of the makefile devoted to
targets generated from the distfiles devoted to fetch and fetch-all
(see the section around _EVERYTHING for variable definition, and the
    # Separate target for each file fetch-all will retrieve
for the actual targets). Apart from that, once a port starts building,
targets "without" an actual file will generate a cookie under WRKDIR or
some subdir of it (see all the _COOKIE* definitions). Of particular interest
are dependencies, where the cookie file is generated depending on the
name of the dependency: that's the infamous
.for _i in ${_DEPLIST}
${WRKDIR}/.dep-${_i:C,>=,ge-,g:C,<=,le-,g:C,<,lt-,g:C,>,gt-,g:C,\*,ANY,g:C,[|:/=],-,g}: ${_WRKDIR_COOKIE}
loop.  The reason for that complexity is that this avoids rechecking
dependencies while working on a port, since only the new stuff will need
evaluation...
In general, make works really bad when you write
phony-target1: phony-target2
	@do_some_shit
as phony targets are always out-of-date, and thus you will always do_some_shit.

Exceptions: generating readmes is a pure phony target, because we want
to regenerate those every single time.

- a lot of computations are very expensive, so there are shortcuts all over
the place whenever a variable is empty, especially for dependency computation.

Introspection
-------------
There's a lot of stuff in the infrastructure which is there only for
introspection purposes: show, show-all, print-plist-*, dump-vars,
*-dir-depends, print-update-signature.
They allow external tools to interrogate the ports tree and get most
information from it. dpb and sqlports rely very heavily on it.

Significant work has been done to achieve better MI. The MULTI_PACKAGES
code now assumes you will define MULTI_PACKAGES *independently of the
architecture*, annotate subpackage with ONLY_FOR_ARCHS-sub if appropriate,
then .include <bsd.port.arch.mk> (which was split off the main file
specifically for that purpose), and rely on BUILD_PACKAGES for building.

There's also some extra introspection elsewhere, for instance,
sqlports is actually built using
make dump-vars 'LIBCXX=${LIBCXX}'...
so that WANTLIBs get the variable name instead of its current values.
This makes sqlports slightly more independent of clang/gcc.

Shell tricks and constructs
---------------------------
- IFS is your friend: echo "a:b:c"| while IFS=: read a b c
will split things along any character without requiring external commands
such as cut.

- don't run test for checking variable values. Rather:
case "$$subdir" in *,*)  ... ;; *) ... ;; esac
is entirely internal

- the shell has booleans. Set variables to true/false to achieve boolean
conditions. Note that those are usually built-ins, and hence even faster
than you would think.

- () forks a subshell.  If you need to syntactally keep a list of commands
together, "{ cmd1; cmd2; }"   is the way to do it.

- prefer if/then/fi shell constructs to complicated && / || expressions.
Especially since the shell has tricky precedence rules.

- use trap to clean up at the end of a command. That's mostly used for
caching stuff (depfile and cache), but also for the locking mechanism.
beware of existing traps in shell fragments (_depfile_fragment,
_cache_fragment, _SIMPLE_LOCK).  Re-using trap will reset. The only way
around that is to fork a subshell (see the subpackage target).

- exec is often used for tail recursion: it replaces the shell with the
executed command. BUT beware of pending traps, as you will lose them.

- all of make commands are executed under -e. Thus, it's deadly to fail
without a test around it. In the end, you might end up with:
    if ( eval $$toset exec ${MAKE} subupdate ); then
this forks a subshell to avoid the unpleasantness of -e, then execs the
resulting command to avoid piling up an extra process.

Make vs. Shell
--------------
There are at least *4* kind of variables in those files:
- internal shell variables
- environment variables
- make variables set on the command line
- make variables set within a makefile

Note that make variables set on the command line ARE set in stone: it's very
difficult to change them from within the Makefile, and they will override
mostly anything you can do. They also appear in the environment.

Thus, a lot of make thingies, such as FLAVOR and SUBPACKAGE must be set in
the environment because make will run into subdirs and requires them to
be changeable.

Also, note that stuff set within Makefiles is not exported to the environment,
you have to be explicit and set them when you run a command.

When things get ambiguous, This document uses $$var for a shell variable,
${VAR} for a variable set in make, and VAR for an environment variables.

VAR != cmd  will run a command *every time* in make. Prefer
VAR !!= cmd   which will run it only when needed.
(the old semantics was kept because some makefiles use != for the side effect
of always running the command and don't care about the value)

pkgpath.mk
----------
Named that way because it mostly deals with pkgpath parsing, but in reality,
it's mostly a lift-up from the common parts between bsd.port.mk and
bsd.port.subdir.mk (no reason for a name change so late in the game)

fragments and common shell variables
------------------------------------
_pflavor_fragment (in pkgpath.mk)
	takes $$subdir as a pkgpath parameter.
	will parse it into a $$dir and $$toset variables, perform a cd $$dir
	so that eval $$toset ${MAKE}   gets you into that
	dependency (this will set PKGPATH, FLAVOR, SUBPACKAGE accordingly)

	depending on $$_fullpath (true/false), will consider the absence
	of flavor to be an empty flavor (e.g. $$toset contains FLAVOR='' )
	or to be the default flavor (e.g. unset FLAVOR)

	takes ${PORTSDIR_PATH} into account

	succeeds if the dependency is correct and the directory is found.
	Otherwise will emit an error message that ends with $$extra_msg.

_flavor_fragment (in pkgpath.mk)
	calls ${_pflavor_fragment} after imposing $$_fullpath=false. e.g., for
	actual dependencies.

_depfile_fragment (in pkgpath.mk)
	sets _DEPENDS_FILE to a temporary file in the environment if it's
	not already set. Set a trap to remove it on end.
	_DEPENDS_FILE is used by the *dir-depends and show-run-depends targets
	to not go thru the same directory twice. Beware of setting it
	"too globally", since it prevents directory walking from happening
	twice.

_cache_fragment (in pkgpath.mk)
	sets _DEPENDS_CACHE to a temporary directory in the environment
	if it's not already set. Set a trap to remove it on end.
	_DEPENDS_CACHE is used to store library lists in _libs2cache.
	pkg_create(1) also knows about it and use it to store plists.
	the ${SUDO} part in the trap is because some of the caching happens
	during ${SUDO} pkg_create.
	XXX also sets PKGPATH, as it reduces drastically the number of
	PKGPATH=... needed in bsd.port.mk

_read_spec/_parse_spec (in bsd.port.mk)
	normally used in a pattern like:
	${_emit_run_depends}|while ${_read_spec}; do
		${_parse_spec};
		...
	done
	takes specs, one per line, sets $$pkg $$subdir $$target,
	and calls ${_flavor_fragment} to cd to the correct dir
	and set $$toset.  Note that this takes "processed" specs:
	transformation of pkgpath>=0.0 -> STEM->=0.0:pkgpath must have
	already happened upfront (which happens in place for all
	*DEPENDS variables).
	This sets $$d to the spec being processed, which is probably a
	bad variable name...

_emit_run_depends/_emit_lib_depends (in bsd.port.mk)
	writes out lists of specs suitable for ${_parse_spec}/${_read_spec}.

_compute_default (in bsd.port.mk)
	comes after ${_pflavor_fragment}, and obtains information from
	the dependent port: sets $$default to the default pkgname,
	$$pkgspec to the default pkgspec (see PKGSPEC), and $$pkgpath
	to the actual pkgpath, when default flavors and subpackages are
	taken into account (hence suitable for FULLPATH=Yes/$$_fullpath=true
	Used alone when we only need $$default, as ${_complete_pkgspec} is
	slightly more expensive.

_complete_pkgspec (in bsd.port.mk)
	comes after ${_pflavor_fragment}, obtains information from
	the dependent port through ${_compute_default}, then make sure
	$$pkg has the correct value.

_libs2cache (in bsd.port.mk)
	requires properly set dependency information obtained through
	${_flavor_fragment} (often through ${_parse_spec}).
	assumes ${_depcache_fragment} has been called, either directly,
	or from an upper level target.
	If not already available, secure the library list from the
	dependency and store it in a file under _DEPENDS_CACHE.
	Sets $$cached_libs to the file name.

Locking
-------
Done thru _DO_LOCK and _SIMPLE_LOCK mostly. Most user-visible targets
redirect through
	${_DO_LOCK}; make _internal-$@
as soon as locking is used (which is the default these days).
The locking mechanism carefully maintains the _LOCK_HELD environment variable
so that dependencies can relock themselves (yeah, it looks strange, but this
happens !).

_SIMPLE_LOCK only exists to provide separate locking targets for fetch: it
assumes $$lock has been set to an appropriate lock file name.


Internal targets are:
_internal-clean _internal-package-only _internal-run-depends
_internal-runlib-depends _internal-runwantlib-depends _internal-package
_internal-build-depends _internal-buildlib-depends
_internal-buildwantlib-depends _internal-test-depends
_internal-prepare _internal-depends _internal-fetch-all
_internal-fetch _internal-all _internal-build _internal-checksum
_internal-configure _internal-deinstall _internal-extract _internal-fake
_internal-patch _internal-plist _internal-test _internal-subpackage
_internal-subupdate _internal-uninstall _internal-update
_internal-update-or-install _internal-update-or-install-all
_internal-update-plist _internal-distpatch _internal-generate-readmes
(yeah, that's a lot).

So, if you type "make", it will do
make build -> lock && make _internal-all

then make _internal-all will normally run thru
_internal-fetch _internal-extract _internal-patch _internal-build
without ever releasing the lock, thus preventing anything else to sneak
in and break the build.

The difference between _internal-package and _internal-package-only is only
due to BULK=Yes, which has to happen exactly once after a user-level target
that involves packaging is finished.

Misc
----
- Modules inclusion is done through a separate modules.port.mk to handle
recursion: modules may trigger the inclusion of other modules, thus
modules.port.mk will re-include itself until the whole list is done.

- COMPILER support is somewhat meshed with modules.port.mk, since it's
a back&forth with arch-defines, the gcc4 module, and the clang module.
It's got special treatment, because COMPILER must happen before gcc4
and clang, while at the same time some modules (qt mostly) may want to
set COMPILER.

- MULTI_PACKAGES and SUBPACKAGE always get set to a non-empty value:
if MULTI_PACKAGES is not set, it ends up as MULTI_PACKAGES=- and
SUBPACKAGE=-. This does make loops over all subpackages simpler, e.g.,
.for _s in ${MULTI_PACKAGES}  since make doesn't do anything if the list
is empty...
As a result, most subpackage-dependent variables end up being used as their
"true" VAR-  form  in the case of a port without multi-packages.
Exception: FULLPKGNAME needs some specific code for the !multi-packages case.
There's also a special case for the PKGDIR stuff, which isn't suffixed with '-'
in the non multi-package case.
Also, dump-vars distinguishes the !multi case.

Note that there are a few checks for empty SUBPACKAGE, these allow the whole
bsd.port.mk file to parse, so that we end up with an actual user error
message instead of some weird internal error.

Dependency variables
--------------------
Old legacy dependency styles are gone, so the compat code went away.
The code has to deal with BUILD_DEPENDS, LIB_DEPENDS, RUN_DEPENDS,
TEST_DEPENDS, WANTLIB and RUN_DEPENDS-* LIB_DEPENDS-*, WANTLIB-*

There is obviously completely different treatment for WANTLIB* and
*DEPENDS*. Also, BUILD* stuff gets to inherit from LIB_DEPENDS* stripped
of inter-dependencies, and RUN-s stuff must have LIB_DEPENDS* from
inter-dependencies.


First, the framework substitutes in place constructs like
pkgpath>=version  into STEM->=version:pkgpath

Then it accumulates dependency into build-time dependencies:
- collect LIB_DEPENDS and LIB_DEPENDS-* as _BUILDLIB_DEPENDS
- collect WANTLIB and WANTLIB-* as _BUILDWANTLIB
(strip any interdependencies)

Symetrically:
- define _LIB4-* as the interdependencies in LIB_DEPENDS-*
- define _LIB4 as the full list of all _LIB4-*

- define _BUILD_DEPLIST, _RUN_DEPLIST, _TEST_DEPLIST, _BUILDLIB_DEPLIST,
_RUNLIB_DEPLIST as the dependency lists relevant to the SUBPACKAGE being
built (not set if NO_DEPENDS is Yes !)

This accumulates as _DEPLIST, which is the full list of all specs in every
dependency.

*2 and *3 have the :configure/:patch/:build modifiers stripped:
_BUILD_DEP2: BUILD_DEPENDS + _BUILDLIB_DEPENDS
_BUILD_DEP3: BUILD_DEPENDS
_BUILD_DEP3-*: _BUILD_DEP3

_RUN_DEP2: RUN_DEPENDS (+LIB_DEPENDS-s if shared)
_RUN_DEP3: RUN_DEPEND
_RUN_DEP3-*: RUN_DEPENDS-* (+LIB_DEPENDS-* if shared)

_LIB_DEP3: (LIB_DEPENDS-s if shared)
_LIB_DEP3-*: (LIB_DEPENDS-* if shared)

_TEST_DEP2: TEST_DEPENDS
_TEST_DEP3: TEST_DEPENDS

_DEPBUILDLIBS: _BUILDWANTLIB
_DEPRUNLIBS: WANTLIB-s

_BUILD_DEP is the _BUILD_DEP2 paths, without the specs
_RUN_DEP is the _RUN_DEP2 paths, without the specs
_TEST_DEP is the _TEST_DEP2 paths, without the specs

e.g., *2 is "all the shit relevant to the port at hand", *3 is
"only the shit directly relevant to the port with subpackages,
or everything pertinent to THAT given subpackage".

This gets converted into cookies relevant to each dependency
_DEPBUILDLIBS -> _DEPBUILDLIBSPECS_COOKIES as .spec-$i
_DEPRUNLIBS -> _DEPRUNLIBSPECS_COOKIES as .spec-$i
_DEPBUILDWANTLIB_COOKIE as .buildwantlibs
_DEPRUNWANTLIB_COOKIE as .runwantlibs-s

Then there are rules for each dependency: a loop over _DEPLIST
for each spec, and code to build both _DEP*WANTLIB_COOKIEs

There's also some dependency in handling in
*wantlib-args, lib-depends-args and run-depends-args.

_DEPENDS_TARGET: used to be DEPENDS_TARGET, and visible, but not documented
for years, and it can be set internally. Ports normally install their
dependencies, with the exception of "make reinstall", and "make update".

_FETCH_MAKEFILE: where to put the output of "fetch-makefile", by default
stdout, but overridden from the main Makefile.

_SHSCRIPT: prepend to a shell script in the infrastructure, e.g.,
    ${_SHSCRIPT}/update-patches

_PERLSCRIPT: likewise for perl scripts.

_ALL_VARIABLES _ALL_VARIABLES_INDEXED _ALL_VARIABLES_PER_ARCH:
	stuff to dump in dump-vars. First is "simple" variable, second one
	will depend on the subpackage, and for last one, must iterate
	over arches to get all relevant information.

_BAD_LICENSING:
note that the PERMIT_* variables are defined for subpackages as soon as
the main variables are defined, so it's enough to check the normal variables.

_lt_libs:
	derived from SHARED_LIBRARIES explicitly for libtool

_FLAVOR_EXT2: FLAVOR_EXT with pseudo-flavors included

_PKG_ARGS: list of stuff to append to base PKG_ARGS-sub based on a lot of
internal choices.

_README_DIR: location for the pkg-readmes, no user serviceable parts inside.

_MASTER: dependencies may be confusing, so set _MASTER to know where we come
from when patching/configuring only.

_DEPENDENCY_STACK: pkg_add -a to avoid manually installed packages.

_CHECK_DEPENDS: save all depends prior to tweaking them, just in case.


_PERL_FIX_SHAR make visible ?

TODO fully document:
	MAKE_JOBS
	PARALLEL_BUILD
	XAUTHORITY for interactive tests

MAKE_JOBS is set by dpb based on DPB_PROPERTIES=parallel or can be set
by the user. When MAKE_JOBS is >1, PARALLEL_MAKE_FLAGS is added to
MAKE_FLAGS, by default this contains -j${MAKE_JOBS} which works for
most ports. Some ports do MAKE_JOBS differently (libreoffice, jdk),
in those cases "make -j" is not wanted, so PARALLEL_MAKE_FLAGS
should be explicitly cleared.

Probably time to rename that and really document it.
fishy: FLAVORS tests ?
fishy: TEST_STATUS_IGNORE

fishy:
	GZIP
	GZIP_CMD
	M4
	STRIP

uninstall ? deinstall

Todo: kill PORT_LD_LIBRARY_PATH

Document: all recursive targets

TODO: document all cookies