220 Commits

Author SHA1 Message Date
Dennis Klein
7e63d4ae9a test: drop thread-sanitizer suppressions
- every entry stood in for a library tsan could not see into; with libzmq,
  libsodium and libstdc++ now tsan-instrumented in the tsan CI job, the
  happens-before edges they establish are visible and nothing is left to
  suppress
- suppressions were blunt (a race: entry matches any frame in the stack),
  so they could also mask real races passing through those frames
2026-06-10 19:31:19 +02:00
Dennis Klein
add85cb18d ci: add tsan spack environment with instrumented libzmq
- mirror spack-latest.yaml, with -fsanitize=thread on the libzmq and
  libsodium nodes so tsan can observe the happens-before edges established
  inside libzmq's lock-free queues, plus the libstdcxx-tsan root spec
- flags are applied per node instead of via the propagating '==' operator,
  which could reach the gcc node and trigger a compiler rebuild
- unchanged roots (fairlogger, boost, ninja, cmake) keep their spec hashes,
  so they are shared with the regular buildcache entries; the instrumented
  nodes hash differently and coexist in the content-addressed cache
- exclude libstdcxx-tsan from concretizer reuse so recipe changes always
  take effect; unchanged recipes still hit the buildcache because the spec
  hash is identical
- add the tsan env to the buildcache matrix (rebuilding also on spack_repo
  changes) so the instrumented binaries are cached instead of rebuilt on
  every CI run
2026-06-10 19:31:19 +02:00
Dennis Klein
331c50ab0e ci: add spack package for a tsan-instrumented libstdc++
- gcc ships no supported switch to build libstdc++ with -fsanitize=thread,
  and spack's gcc recipe filters all flags out of the target-library build
  (CXXFLAGS_FOR_TARGET is owned by its generated --with-build-config=spack
  makefile), so provide a dedicated libstdcxx-tsan package in a custom repo
- build only the libstdc++-v3 subtree from the matching gcc release tarball,
  configured standalone against the already-installed toolchain (recipe
  modeled on https://iree.dev/developers/debugging/sanitizers/), instead of
  rebuilding all of gcc
- the result is a drop-in runtime replacement for the compiler's libstdc++
  (same soname and symbol versions), to be loaded only by the instrumented
  test executables
- normalize the install layout after make install: the standalone build puts
  the runtime libraries into the multilib os dir (lib64 on x86_64) regardless
  of --libdir, and --with-toolexeclibdir only applies to cross builds
- register the repo in the setup-deps action before creating the env
2026-06-10 19:31:19 +02:00
Dennis Klein
b568535910 test: support an alternative runtime library dir per test
- introduce FAIRMQ_TEST_LD_LIBRARY_PATH, which prepends a directory to
  each test's environment via ctest, so the tests can run against an
  alternative runtime library (e.g. a tsan-instrumented libstdc++)
- LD_LIBRARY_PATH rather than an injected rpath: an rpath added via the
  linker flags cannot precede the rpath spack's gcc adds through its
  specs file, so the compiler's own libstdc++ would keep winning the
  runtime search order
- scoped per test on purpose: an instrumented library has unresolved
  __tsan_* symbols and must not be loaded into uninstrumented tools
  like cmake, ctest or ninja
- fail the configuration instead of silently dropping the injection on
  CMake < 3.22 (ENVIRONMENT_MODIFICATION)
- cover the example tests too; they share the instrumented runtime but
  not the locale-cache warmup (their main() is the installed public
  header). The custom-controller env block was dead before: it tested
  lsan_options, which only ever existed in the add_example() function
  scope, so the test also never received the LSan suppressions
2026-06-10 19:31:19 +02:00
Dennis Klein
2bd9a072a9 test: pre-fill libstdc++ ctype caches before threads exist
- std::ctype<char> caches narrow()/widen() results per character in
  plain char arrays of the global classic-locale facet, written without
  synchronization from header-inlined code (locale_facets.h); two
  threads exercising an uncached character concurrently (e.g. compiling
  a std::regex in Channel::Validate) constitute a true data race that
  ThreadSanitizer rightfully reports
- the stores are real and unsynchronized, so a tsan-instrumented
  libstdc++ cannot help here; instead fill the caches before any thread
  is spawned, which turns every later access into a pure read
- warm the lazily-installed num_put/num_get caches used by stream
  insertion/extraction as well, via a small format/parse round-trip
- wire the warm-up into the gtest runner main() and, via a static
  initializer, into the test device runner
2026-06-10 19:31:19 +02:00
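The warm-up idea from 2bd9a072a9 can be sketched roughly as below. This is a minimal illustration, not the code from the commit: `WarmUpLocaleCaches` is a hypothetical name, and the exact caching behaviour of `narrow()`/`widen()` is a libstdc++ implementation detail that other standard libraries need not share.

```cpp
#include <limits>
#include <locale>
#include <sstream>

// Hypothetical helper (assumption, not the FairMQ implementation).
// Intended to be called once, before any thread is spawned.
// Returns the integer parsed back in the round-trip as a sanity check.
int WarmUpLocaleCaches()
{
    auto const& ct = std::use_facet<std::ctype<char>>(std::locale::classic());

    // Exercise widen()/narrow() for every char value once, so that the
    // per-character results libstdc++ caches are filled up front.
    for (int c = std::numeric_limits<char>::min();
         c <= std::numeric_limits<char>::max(); ++c) {
        auto const ch = static_cast<char>(c);
        (void)ct.widen(ch);
        (void)ct.narrow(ch, '\0');
    }

    // Small format/parse round-trip so the number formatting and parsing
    // facets used by stream insertion/extraction are exercised here too.
    std::stringstream ss;
    ss.imbue(std::locale::classic());
    ss << 42 << ' ' << 1.5;
    int i = 0;
    double d = 0.0;
    ss >> i >> d;
    return i;
}
```

Calling such a helper at the top of `main()` keeps the warm-up ahead of every thread the tests create.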
Dennis Klein
19e607e486 test: fix racy loop-variable capture in SubscriptionThreadSafety
- the subscriber threads captured the loop counter by reference while
  the spawning loop kept incrementing it: a genuine data race
- depending on timing, threads could also end up with duplicate
  subscriber names; capture the counter by value instead
2026-06-10 19:31:19 +02:00
Alexey Rybalchenko
215c31428b feat(shmem): expose side-channel metadata API for unsent messages
Add two public entry points needed by the ALICE use case where shmem
messages are allocated via a transport but never sent — their metadata
is instead serialised into Arrow tables and delivered over a separate
channel, allowing consumer devices to resolve the payload pointer
without taking ownership.

shmem::Message::GetMeta() returns the MetaHeader of the message,
mirroring the existing positional-init pattern already used in Socket.h.

shmem::GetDataAddressFromHandle(TransportFactory&, const MetaHeader&)
is a free function declared in Common.h and defined in Manager.cxx.
Keeping it out of the TransportFactory class body means callers only
need to include Common.h (available transitively via Message.h) and do
not drag in Socket.h or zmq.h. The implementation handles both managed
segments and unmanaged regions, and throws SharedMemoryError with a
typed message on a bad segment or region id. TransportFactory also
gains a same-named member for callers that already have the concrete
type. Lifetime of the returned pointer is the caller's responsibility;
the cache device is expected to hold the messages alive.

A SideChannel test covers the GetMeta/GetDataAddressFromHandle
round-trip for both standard and expanded-metadata configurations.
2026-06-10 18:51:04 +02:00
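The side-channel pattern described in 215c31428b can be pictured with a toy model. Every name below (`ToyMeta`, `ToySegments`, `ResolveAddress`) is hypothetical, and the layout is an assumption; it does not reproduce the real `MetaHeader` or the FairMQ shmem implementation. It only shows a consumer resolving a payload pointer from metadata without taking ownership.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for message metadata sent over a separate channel.
struct ToyMeta
{
    std::uint16_t segmentId; // which segment holds the payload
    std::size_t offset;      // position of the payload inside the segment
    std::size_t size;        // payload size in bytes
};

// Toy stand-in for the shared segments, keyed by segment id.
using ToySegments = std::map<std::uint16_t, std::vector<char>>;

// Resolves the payload address. The returned pointer is non-owning: the
// caller must keep the segment (and the message) alive while using it.
char* ResolveAddress(ToySegments& segments, ToyMeta const& meta)
{
    auto it = segments.find(meta.segmentId);
    if (it == segments.end()) {
        throw std::runtime_error("unknown segment id "
                                 + std::to_string(meta.segmentId));
    }
    if (meta.offset + meta.size > it->second.size()) {
        throw std::out_of_range("payload exceeds segment bounds");
    }
    return it->second.data() + meta.offset;
}
```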
Dennis Klein
a0e8271aca test: suppress libzmq-induced thread-sanitizer false positives
- libzmq is not tsan-instrumented, so tsan cannot see the happens-before
  its queues establish between user threads and libzmq I/O threads,
  producing false-positive data races on message buffers
- add test/thread_sanitizer_suppressions.txt and point TSAN_OPTIONS at it
  via the sanitizers job env so it reaches the tests and their device
  subprocesses
- suppress: accesses made directly from libzmq, the zero-copy message
  deleters libzmq runs from msg_t::close, shmem receive-side metadata
  reads, and std::regex/locale lazy-init races in libstdc++
2026-06-09 23:00:58 +02:00
Dennis Klein
7a44c5e19e style: add braces around single-statement bodies
- wrap single-statement control-flow bodies in braces
- clang-tidy readability-braces-around-statements
  https://clang.llvm.org/extra/clang-tidy/checks/readability/braces-around-statements.html
2026-06-09 23:00:58 +02:00
Dennis Klein
0fd27cbbc3 ci: match renamed libzmq leak frame in lsan suppressions
- LSan symbolizes the leak as the C++ method `zmq::msg_t::init_size`
  in the Debug sanitizer build, no longer the C wrapper
  `zmq_msg_init_size`
- substring match failed (`_` vs `::`), so the suppression no longer
  applied and the asan+lsan+ubsan job failed in Pair/PubSub/Poller tests
- add the demangled frame, keep the old pattern for older libzmq
2026-06-09 23:00:58 +02:00
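Why the old pattern stopped matching in 0fd27cbbc3 can be shown with a simplified model. The sketch assumes a suppression applies when its pattern occurs as a substring of a symbolized frame; `IsSuppressed` is a hypothetical helper, and real LSan patterns may also use wildcards.

```cpp
#include <string>
#include <vector>

// Simplified model (assumption): a frame is suppressed when any pattern
// occurs in it as a plain substring.
bool IsSuppressed(std::string const& frame,
                  std::vector<std::string> const& patterns)
{
    for (auto const& pattern : patterns) {
        if (frame.find(pattern) != std::string::npos) {
            return true;
        }
    }
    return false;
}
```

Under this model `zmq_msg_init_size` is not a substring of `zmq::msg_t::init_size`, so listing both patterns covers the old and the new symbolization.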
Dennis Klein
f374e228ff ci: cache gcc as a buildcache node instead of committed lockfiles
Committed lockfiles pinned gcc as a host-path external (from spack compiler
find), which is not portable across runners and broke CI. Cache the gcc
compiler itself as a buildcache node instead, so CI pulls it (~1 min) rather
than building it from source (~1 h).

- push the freshly-built gcc node in setup-deps BEFORE spack compiler find
  (which marks it external and excludes it from buildcache push), gated behind
  a push-gcc input used only by the buildcache workflow
- drop the committed-lockfile approach: remove test/ci/locks, the lockfile
  install path in setup-deps, and the lockfile export in the buildcache workflow
- drop the ignored ref input from setup-spack (v3 renamed it to spack_ref)
2026-06-08 23:04:29 +02:00
Dennis Klein
bb5c0a998c ci: install deps from committed lockfiles when present
Reusing concretization between the weekly buildcache (fresh) and weekday CI
(reuse) can drift if runner externals change, causing avoidable cache misses.

- setup-deps installs from test/ci/locks/<env>-gcc<N>.lock when it exists,
  skipping concretization for byte-identical hashes; falls back to the spec
  yaml otherwise
- buildcache exports each env's spack.lock as a downloadable artifact so the
  lockfiles can be regenerated on the ubuntu-24.04 runner and committed
- document the manual regeneration flow in test/ci/locks/README.md
2026-05-31 21:15:44 +02:00
Dennis Klein
fa64faf3f7 fix(boost): add compatibility for Boost.Process v1 API in Boost 1.89+
Boost 1.88 replaced Boost.Process with v2, breaking the v1 API.
Boost 1.89 restores v1 compatibility via <boost/process/v1.hpp>.

- Fail configuration if Boost 1.88 is detected
- Define FAIRMQ_BOOST_PROCESS_V1_HEADER for Boost >= 1.89
- Use conditional includes to select v1.hpp or process.hpp
- Add namespace aliases (bp, bp_this) for portable API access
2026-01-05 14:11:19 +01:00
Dennis Klein
642a4e06f0 ci: force generic x86_64_v3 target for all packages 2025-11-30 20:48:13 +01:00
Dennis Klein
a422361ee9 ci: set x86_64_v3 target for consistent buildcache 2025-11-30 20:10:50 +01:00
Dennis Klein
d1fbe4e89a ci: add named spack environments with boost187 variant 2025-11-30 19:58:19 +01:00
Dennis Klein
399170879c ci: improve buildcache workflow
- rename mirror to ghcr-buildcache
- find system compiler before building gcc
- separate update-index job to avoid race condition
- always attempt push even on partial failure
2025-11-30 15:51:13 +01:00
Dennis Klein
1392a31250 ci: fix OCI registry authentication for buildcache push 2025-11-30 14:50:02 +01:00
Dennis Klein
7a8ccb8df6 ci: migrate from Jenkins to GitHub Actions 2025-11-28 21:13:30 +01:00
Dennis Klein
dcea48fcee fix: parse errors
```
/test/memory_resources/_memory_resources.cxx: In member function ‘virtual void {anonymous}::MemoryResources_allocator_Test::TestBody()’:
/test/memory_resources/_memory_resources.cxx:104:12: error: parse error in template argument list
  104 |     config.SetProperty<string>("session", to_string(session));
      |            ^~~~~~~~~~~~~~~~~~~
/test/memory_resources/_memory_resources.cxx:104:31: error: no matching function for call to ‘fair::mq::ProgOptions::SetProperty<<expression error> >(const char [8], std::string)’
  104 |     config.SetProperty<string>("session", to_string(session));
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/dklein/projects/FairMQ2/test/memory_resources/_memory_resources.cxx:11:
/fairmq/ProgOptions.h:269:6: note: candidate: ‘template<class T> void fair::mq::ProgOptions::SetProperty(const std::string&, T)’
  269 | void fair::mq::ProgOptions::SetProperty(const std::string& key, T val)
      |      ^~~~
/fairmq/ProgOptions.h:269:6: note:   template argument deduction/substitution failed:
/test/memory_resources/_memory_resources.cxx:104:31: error: template argument 1 is invalid
  104 |     config.SetProperty<string>("session", to_string(session));
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/test/memory_resources/_memory_resources.cxx: In member function ‘virtual void {anonymous}::MemoryResources_getMessage_Test::TestBody()’:
/test/memory_resources/_memory_resources.cxx:132:12: error: parse error in template argument list
  132 |     config.SetProperty<string>("session", to_string(session));
      |            ^~~~~~~~~~~~~~~~~~~
/test/memory_resources/_memory_resources.cxx:132:31: error: no matching function for call to ‘fair::mq::ProgOptions::SetProperty<<expression error> >(const char [8], std::string)’
  132 |     config.SetProperty<string>("session", to_string(session));
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/fairmq/ProgOptions.h:269:6: note: candidate: ‘template<class T> void fair::mq::ProgOptions::SetProperty(const std::string&, T)’
  269 | void fair::mq::ProgOptions::SetProperty(const std::string& key, T val)
      |      ^~~~
/fairmq/ProgOptions.h:269:6: note:   template argument deduction/substitution failed:
/test/memory_resources/_memory_resources.cxx:132:31: error: template argument 1 is invalid
  132 |     config.SetProperty<string>("session", to_string(session));
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
2025-06-13 08:17:53 +02:00
Dennis Klein
c80f97b338 fix(tools): No longer use removed query API
Deprecated via 74fe2b8e14 and removed via e916bdfb1a in Boost 1.87 or Asio 1.33.
2025-01-09 17:09:57 +01:00
Alexey Rybalchenko
39cb021827 Add 'no control' controller 2024-02-19 22:09:54 +01:00
Alexey Rybalchenko
2df3d909fa shm: when refCount segment size is zero, fall back to old behaviour, which is to store reference counts inside the main data segment
2023-11-29 19:21:42 +01:00
Dennis Klein
961eca5276 test(PluginServices): state change subscription thread-safety 2023-11-10 13:13:13 +01:00
Alexey Rybalchenko
6122010694 Fix address clashes in tests 2023-10-24 15:22:21 +02:00
Alexey Rybalchenko
4310d07ed1 deduplicate ipc address in a test 2023-09-29 11:18:24 +02:00
Alexey Rybalchenko
25614e3e06 test: Add coverage for --shm-metadata-msg-size 2023-06-13 21:24:40 +02:00
Alexey Rybalchenko
3decac58fc test: Add data transfer and checks to protocol tests 2023-06-13 21:24:40 +02:00
Dennis Klein
c8fde17b6a ci: Silence lsan hits in libzmq 2023-03-06 15:32:48 +01:00
Dennis Klein
05b734ee0d feat!: Migrate to std::filesystem consistently 2023-03-06 15:32:48 +01:00
Dennis Klein
0aecfff133 feat(plugins)!: Remove PMIx plugin 2023-03-02 11:20:35 +01:00
Dennis Klein
2e98a4e2cb feat(ofi)!: Remove ofi transport
BREAKING CHANGE

Due to a lack of users, we remove the experimental code. The
latest implementation can be found in release v1.4.56. This does
not mean it will never be picked up again, but for now there are
no plans.
2023-03-02 11:20:35 +01:00
Dennis Klein
c35d35a3c3 feat!: Remove Device::TransitionTo() without replacement
BREAKING CHANGE

This API was never advertised nor used by anyone.
2023-03-01 15:39:38 +01:00
Dennis Klein
c2fa2e8848 test: Deduplicate code and fix [-Wunused-result] 2023-03-01 15:39:38 +01:00
Dennis Klein
b25c0787c0 test: Fix [-Wunused-result] 2023-03-01 15:39:38 +01:00
Dennis Klein
84de22f80b test: Consolidate some device control logic 2023-03-01 15:39:38 +01:00
Dennis Klein
435d07eaf9 feat: Improve ChangeState API
* Add `[[nodiscard]]` to `bool Device::ChangeState()`
* Introduce throwing variant `void Device::ChangeStateOrThrow()`

resolves #441
2023-03-01 15:39:38 +01:00
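The two-function pattern from 435d07eaf9 can be sketched with a toy class. `ToyDevice`, its states, and its transition names are hypothetical; only the pairing of a `[[nodiscard]] bool` function with a throwing `...OrThrow` variant mirrors the commit message.

```cpp
#include <stdexcept>
#include <string>

// Toy device (assumption, not fair::mq::Device).
class ToyDevice
{
  public:
    // Returns false if the transition is rejected. [[nodiscard]] makes the
    // compiler warn when a caller silently ignores that result.
    [[nodiscard]] bool ChangeState(std::string const& transition)
    {
        if (transition == "RUN" && fState == "READY") {
            fState = "RUNNING";
            return true;
        }
        if (transition == "STOP" && fState == "RUNNING") {
            fState = "READY";
            return true;
        }
        return false;
    }

    // Throwing variant for callers that treat a rejected transition as an error.
    void ChangeStateOrThrow(std::string const& transition)
    {
        if (!ChangeState(transition)) {
            throw std::runtime_error("invalid transition " + transition
                                     + " from state " + fState);
        }
    }

    std::string const& State() const noexcept { return fState; }

  private:
    std::string fState = "READY";
};
```

Callers that can recover check the `bool`; callers that cannot use the throwing variant and avoid an unchecked result.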
Alexey Rybalchenko
45663189a9 Turn shm-monitor off by default
resolves #459
2023-02-24 14:28:18 +01:00
Dennis Klein
a58b4870d7 feat(Parts): Refine and tweak
* Optimize appending another Parts container
* Remove redundant/verbose comments
* Change r-value args to move-only types into l-value args for
  readability
* Deprecate `AtRef(int)`, redundant, just dereference at call site
* Deprecate `AddPart(Message*)`, avoid owning raw pointer args
* Add various const overloads
* Add `Empty()` and `Clear()` member functions
* Add `noexcept` where applicable
2023-02-24 13:59:27 +01:00
Alexey Rybalchenko
ac661dfd63 Add test for externally (outside the session) created shmem region 2022-10-05 09:13:37 +02:00
Dennis Klein
9a51c7b5fb ci: Update and use images from ghcr.io/fairrootgroup/fairmq-dev 2022-08-12 01:50:14 +02:00
Dennis Klein
ca420a0e0d feat(plugins): Allow kebab-case plugin names, e.g. libfairmq-plugin-pmix
Camel+snake-case plugin names are still allowed, e.g. `libFairMQPlugin_pmix`.
2022-08-11 15:30:25 +02:00
Dennis Klein
b798b1e098 test: Increase robustness of the test suite for high -j 2022-08-11 15:30:25 +02:00
Dennis Klein
ac1904661a test(channel): Increase sleep time
The logic of the GetNumberOfConnectedPeers test case relies on sleeping
for a fixed time. We have observed that the 10ms sleep is sometimes too
short. Increasing it to 100ms should improve test stability.
2022-08-11 15:30:25 +02:00
Dennis Klein
12a85c6fb1 fix: Use namespaced typenames/headers 2022-08-11 15:30:25 +02:00
Dennis Klein
cda7282422 feat!: Remove deprecated components sdk, sdk_commands, dds_plugin
BREAKING CHANGE: Components have been moved to ODC project, see
https://github.com/FairRootGroup/FairMQ/discussions/392 for details.
2022-08-11 15:30:25 +02:00
Dennis Klein
eb9ddc81cf ci: Run thread sanitizer with clang++ 2022-03-21 16:28:43 +01:00
Dennis Klein
3b2ad1f6f4 ci: Add Fedora 35 build 2022-03-21 16:28:43 +01:00
Alexey Rybalchenko
b747a8787c shm: check region size when opening existing 2022-02-08 09:09:25 +01:00
Alexey Rybalchenko
5f33401d41 Parallelize more tests 2022-01-25 11:55:38 +01:00