Cirrus is about to shut down its macOS-on-Intel support, so it's time to
move our CI testing over to ARM instances. The Homebrew package manager
changed its default installation prefix for the new architecture, so a
couple of tests need tweaks to find binaries.
Back-patch to 15, where in-tree CI began.
Author: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20221122225744.GF11463%40telsasoft.com
The tests don't have much coverage of segment related code, as we don't create
large enough tables. To make it easier to test these paths, add a new option
specifying the segment size in blocks.
Set the new option to 6 blocks in one of the CI tasks. Smaller numbers
currently fail one of the tests, for understandable reasons.
While at it, fix some segment size related issues in the meson build.
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20221107171355.c23fzwanfzq2pmgt@awork3.anarazel.de
To run all tests that support running against existing server:
$ meson test --setup running
To run just the main pg_regress tests against existing server:
$ meson test --setup running regress-running/regress
To ensure the 'running' setup continues to work, test it as part of the
freebsd CI task.
Discussion: https://postgr.es/m/CAH2-Wz=XDQcmLoo7RR_i6FKQdDmcyb9q5gStnfuuQXrOGhB2sQ@mail.gmail.com
We have coverage of the various sanitizers in the buildfarm. The sanitizers
however particularly interesting during the development of patches, where the
likelihood of bugs is even higher. There also have been complaints about only
seeing such failures on the buildfarm, rather than before commit.
This commit enables a reasonable set of sanitizers in CI. Use the linux task
for that, as it currently is one of the fastests tasks. Also several of the
sanitizers work best on linux.
The overhead of alignment sanitizer is low, undefined behaviour has moderate
overhead. Test alignment sanitizer in the meson task, as it does both 32 and
64 bit builds and is thus more likely to expose alignment bugs.
Address sanitizer in contrast somewhat expensive. Enable it in the autoconf
task, as the meson task tests both 32 and 64bit which would exacerbate the
cost.
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Discussion: https://postgr.es/m/20221121220903.kf5u7rokfzbmqskm@alap3.anarazel.de
To avoid unnecessarily spinning up a lot of VMs / containers for entirely
broken commits, have a minimal task that all others depend on.
The concrete motivation for the change is to use sanitizers in the linux
tasks. As that makes the tests slower, the start of the CompilerWarnings would
be delayed even more. With this change the CompilerWarnings only depends on
the SanityCheck task.
This has the added advantage that now the CompilerWarnings task is not
prevented from running by (most) test failures (particularly annoying when
caused by a test that is flappy in HEAD).
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20221002205201.injtofbx4ax4erww@awork3.anarazel.de
We don't need CIRRUS_ESCAPING_PROCESSES anymore as the whole tests now run
within a single script: block. We don't need NO_TEMP_INSTALL anymore it was
addressing an issue specific to vcregress.pl.
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Discussion: https://postgr.es/m/20221113235303.GA26337@telsasoft.com
For now the task has been set to be manually triggered, as we are already
limited by the amount of CI time available for windows, particularly on cfbot.
Author: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAGPVpCSKS9E0An4=e7ZDnme+y=WOcQFJYJegKO8kE9=gh8NJKQ@mail.gmail.com
This substantially speeds up building for windows, due to the vast amount of
headers included via windows.h. A cross build from linux targetting mingw goes
from
994.11user 136.43system 0:31.58elapsed 3579%CPU
to
422.41user 89.05system 0:14.35elapsed 3562%CPU
The wins on windows are similar-ish (but I don't have a system at hand just
now for actual numbers). Targetting other operating systems the wins are far
smaller (tested linux, macOS, FreeBSD).
For now precompiled headers are disabled by default, it's not clear how well
they work on all platforms. E.g. on FreeBSD gcc doesn't seem to have working
support, but clang does.
When doing a full build precompiled headers are only beneficial for targets
with multiple .c files, as meson builds a separate precompiled header for each
target (so that different compilation options take effect). This commit
therefore only changes target with at least two .c files to use precompiled
headers.
Because this commit adds b_pch=false to the default_options new build
directories will have precompiled headers disabled by default, however
existing build directories will continue use the default value of b_pch, which
is true.
Note that using precompiled headers with ccache requires setting
CCACHE_SLOPPINESS=pch_defines,time_macros to get hits.
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CA+hUKG+50eOUbN++ocDc0Qnp9Pvmou23DSXu=ZA6fepOcftKqA@mail.gmail.com
Discussion: https://postgr.es/m/c5736f70-bb6d-8d25-e35c-e3d886e4e905@enterprisedb.com
Discussion: https://postgr.es/m/20190826054000.GE7005%40paquier.xyz
Increase likelihood of discovering problems during CI rather the buildfarm by
defining -DRELCACHE_FORCE_RELEASE -DCOPY_PARSE_PLAN_TREES
-DWRITE_READ_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST on FreeBSD, and
-DRANDOMIZE_ALLOCATED_MEMORY on macOS.
FreeBSD and macoS are currently not CI bottlenecks, so we can afford to
increase their test times a bit.
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Discussion: https://postgr.es/m/20220910200542.GX31833@telsasoft.com
Test performance regresses noticably when using all cores. This is more
pronounced with meson than with autoconf, presumably because meson will
schedule the "full number" of tests more consistently. 8 seems to work
OK.
Discussion: https://postgr.es/m/20220927040208.l3shfcidovpzqxfh@awork3.anarazel.de
Backpatch: 15-, where CI was introduced
It's easy enough to make changes that break on 32bit platforms and few people
test that locally. Add a test for that to CI. LLVM is disabled on 32bit
because installing it would bloat the image size further.
The debian w/ meson task is fast enough that we can afford to test both.
Use the occasion of a separate run of the tests to run them under LANG=C,
we've recently discovered there's not a lot of testing in the buildfarm for
the case.
Discussion: https://postgr.es/m/4033181.1664395633@sss.pgh.pa.us
Ommitted in e6b6ea025c, but necessary for libintl to be found. All other
dependencies can be found via pkg-config (which searches in /usr/local/...)
and thus worked even without adding the directories.
The Windows task is changed to use meson as there currently is no way to run
all tests in the old MSVC build system (only ninja is covered for now, we
don't have enough CI resources to test msbuild as well).
To maintain autoconf coverage, the Linux task is duplicated to test both meson
and autoconf builds (linux is currently the fastest task). FreeBSD and macOS
are also converted to meson, as it seems more important to have coverage for
meson than autoconf.
Author: Andres Freund <andres@anarazel.de>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Author: Justin Pryzby <pryzby@telsasoft.com>
Cirrus defaults to SetErrorMode(SEM_NOGPFAULTERRORBOX | ...). That prevents
crash reporting from working unless binaries do SetErrorMode()
themselves. Furthermore, it appears that either python or, more likely, the C
runtime has a bug where SEM_NOGPFAULTERRORBOX can very occasionally *trigger*
a crash on process exit - which is hard to debug, given that it explicitly
prevents crash dumps from working...
Discussion: https://postgr.es/m/20220909235836.lz3igxtkcjb5w7zb%40awork3.anarazel.de
Backpatch: 15-, where CI was added
CI builds recently started failing with:
"Memory size for 4.0 vCPU instance should be between 3840MiB and
26624MiB, while 2048MiB is requested."
Ok then, let's ask for 4G instead of 2G.
This may be due to a change in the type of instance used to work around
an outage, per:
https://twitter.com/cirrus_labs/status/1572657320093712384
freebsd 13.0 is out of support, switch to 13.1. It might be a good idea to
remove the minor version number from the image name, but there's not been a
response to that so far...
Discussion: https://postgr.es/m/20220728095704.ryywoaz4dqqrwstc@alap3.anarazel.de
Backpatch: 15-, just like the CI support
322becb has changed upgradecheck to use the TAP tests, discarding
pg_upgrade's tests in bincheck. However, this is proving to be a bad
idea for the Windows buildfarm clients that use MSVC when TAP tests are
disabled as this causes a hard failure at the pg_upgrade step.
This commit disables upgradecheck, moving the execution of the tests of
pg_upgrade to bincheck, as per an initial suggestion from Andres
Freund, so as the buildfarm is able to live happily with those changes.
While on it, remove the routine that was used by upgradecheck to
create databases whose names are generated with a range of ASCII
characters as it is not used since 322becb. upgradecheck is removed
from the CI script for Windows, as bincheck takes care of that now.
Per report from buildfarm member hamerkop (MSVC 2017 without a TAP
setup).
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/YkbnpriYEAagZ2wH@paquier.xyz
cirrus-ci now defaults to killing processes still running at the end of a
script. Unfortunately we start postgres in the background, which seems
nontrivial to fix. Previously we worked around that in 770011e3f3 by using an
older agent version, but now that CIRRUS_ESCAPING_PROCESSES we should use that.
This reverts commit 770011e3f3 "ci: windows:
Work around cirrus-ci bug causing test failures.
Discussion: https://postgr.es/m/CA+hUKGKx7k14n2nAALSvv6M_AB6oHasNBA65X6Dvo8hwfi9y0A@mail.gmail.com
To improve performance of check-world, and improve debugging, without
significantly slower builds (they're cached anyway).
This makes freebsd check-world run in 8.5 minutes rather than 15 minutes.
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220310220611.GH28503@telsasoft.com
This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220310220611.GH28503@telsasoft.com
Having the different types of logs in artifacts instruction makes it quicker
to navigate between them. Either that didn't use to be possible, or I just did
it wrong. I saw that / how it works in a patch by Justin Pryzby.
Also remove a few unnecessary uses of **, suggested by Justin Pryzby.
Discussion: https://postgr.es/m/20220203195718.smqo5xg4ygp5qktq@alap3.anarazel.de
startup_hacks() called SetErrorMode() with the SEM_NOGPFAULTERRORBOX argument
to prevent GUI popups on error. While that likely was sufficient at some
point, there are other sources of error popups.
At the same time SEM_NOGPFAULTERRORBOX unfortunately also prevents
"just-in-time debuggers" from working reliably, i.e. the ability to attach to
a process on crash. This prevents collecting crash dumps as part of CI.
The error popups are particularly problematic when they occur during automated
testing, as they can cause the tests to hang, waiting for a button to be
clicked.
This commit improves the error handling setup in startup_hacks() to address
those problems. SEM_NOGPFAULTERRORBOX is not used anymore, instead various
other APIs are used to disable popups and to redirect output to stderr where
possible.
While this improves the situation for postgres.exe, it doesn't address similar
issues in all the other executables. There currently is no codepath that's
called early on for all frontend programs.
I've tested that this prevents GUI popups and allows JIT debugging in case of
crashes due to:
- abort()
- assert()
- C runtime errors
- unhandled exceptions
both in debug and non-debug mode, on Win10 with MSVC 2019 and with MinGW.
Now that crash reports are generated on windows, collect them in windows CI.
Discussion: https://postgr.es/m/20211005193033.tg4pqswgvu3hcolm@alap3.anarazel.de
On windows ci/cfbot currently regularly hangs / times out. Presumably this is due
to the issues discussed in
https://postgr.es/m/CA%2BhUKG%2BG5DUNJfdE-qusq5pcj6omYTuWmmFuxCvs%3Dq1jNjkKKA%40mail.gmail.com
which lead to reverting 75674c7ec1 everywhere but master.
But it's hard to tell - because the entire test task times out, we don't get
to see debugging information.
This commit adds a timeout (using git's timeout binary) to all vcregress
invocations, which should address the problem for now. It might also be a good
idea to add timeouts to the other operating systems at a later time.
The diff is a bit larger than one might think necessary: Yaml doesn't like % -
from the windows command variable syntax - at the start of an unquoted
string...
Discussion: https://postgr.es/m/20220110005704.es4el6i2nxlxzwof@alap3.anarazel.de
The build summary was disabled unintentionally by setting the build verbosity
lower.
While at it, also add ForceNoAlign to prevent msbuild from introducing
linebreaks in the middle of filenames etc - they make it harder to copy
output.
Discussion: https://postgr.es/m/20220113175554.u6gw7olrdfzivl3n@alap3.anarazel.de
Currently FreeBSD, Linux, macOS and Windows (Visual Studio) are tested.
The main goal of this integration is to make it easier to test in-development
patches across multiple platforms. This includes improving the testing done
automatically by cfbot [1] for commitfest entries. It is *not* the goal to
supersede the buildfarm.
cirrus-ci [2] was chosen because it was already in use for cfbot, allows using
full VMs, has good OS coverage and allows accessing the full test results
without authentication (like a github account). It might be worth adding
support for further CI providers, particularly ones supporting other git
forges, in the future.
To keep CI times tolerable, most platforms use pre-generated images. Some
platforms use containers, others use full VMs.
For instructions on how to enable the CI integration in a repository and
further details, see src/tools/ci/README
[1] http://cfbot.cputube.org/
[2] https://cirrus-ci.org/
Author: Andres Freund <andres@anarazel.de>
Author: Thomas Munro <tmunro@postgresql.org>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: Thomas Munro <tmunro@postgresql.org>
Reviewed-By: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/20211001222752.wrz7erzh4cajvgp6@alap3.anarazel.de