Raymond P. Burkholder - Things I Do - Software Development

C++ Header File Statistics

nospam@example.com (Raymond P. Burkholder) — Sun, 04 Feb 2024 14:02:56 +0000

From #include <rules>, via hacker news, two compile time options to consider:

use the preprocessor output (cl /E, gcc -E)
use the include output (cl /showIncludes, gcc -M), gather the codebase statistics (average size after preprocessing, most included header files, header files with largest payload, etc.)

I've been doing this backwards, but I don't understand why though:

The header file named after the source should be included first (to catch errors in the header)

Boehm Garbage Collection, Cords String Handling

nospam@example.com (Raymond P. Burkholder) — Sun, 21 Jan 2024 16:39:38 +0000

HackerNews had an article about Boehm-Demers-Weiser conservative C/C++ Garbage Collector which leads to A garbage collector for C and C++.

It can be used in garbage collection mode or leak detection mode.

The garbage collector distribution includes a C string (cord) package that provides for fast concatenation and substring operations on long strings. A simple curses- and win32-based editor that represents the entire file as a cord is included as a sample application. From Wikipedia:

Boehm GC is also distributed with a C string handling library called cords. This is similar to ropes in C++ (trees of constant small arrays), but instead of using reference counting for proper deallocation, it relies on garbage collection to free objects. Cords are good at handling very large texts, modifications to them in the middle, slicing, concatenating, and keeping history of changes (undo/redo functionality).

Code can be found at github - The Boehm-Demers-Weiser conservative C/C++ Garbage Collector (bdwgc, also known as bdw-gc, boehm-gc, libgc)

Parallelism and Concurrency

nospam@example.com (Raymond P. Burkholder) — Wed, 10 Jan 2024 03:47:38 +0000

HPX -- An open source C++ Standard Library for Parallelism and Concurrency

To achieve scalability with today's heterogeneous HPC resources, we need a dramatic shift in our thinking; MPI+X is not enough. Asynchronous Many Task (AMT) runtime systems break down the global barriers imposed by the Bulk Synchronous Programming model. HPX is an open-source, C++ Standards compliant AMT runtime system that is developed by a diverse international community of collaborators called The Ste||ar Group. HPX provides features which allow application developers to naturally use key design patterns, such as overlapping communication and computation, decentralizing of control flow, oversubscribing execution resources and sending work to data instead of data to work. The Ste||ar Group comprises physicists, engineers, and computer scientists; men and women from many different institutions and affiliations, and over a dozen different countries. We are committed to advancing the development of scalable parallel applications by providing a platform for collaborating and exchanging ideas. In this paper, we give a detailed description of the features HPX provides and how they help achieve scalability and programmability, a list of applications of HPX including two large NSF funded collaborations (STORM, for storm surge forecasting; and STAR (OctoTiger) an astro-physics project which runs at 96.8% parallel efficiency on 643,280 cores), and we end with a description of how HPX and the Ste||ar Group fit into the open source community.

TimeGraphs: Graph-based Temporal Reasoning

Many real-world systems exhibit temporal, dynamic behaviors, which are captured as time series of complex agent interactions. To perform temporal reasoning, current methods primarily encode temporal dynamics through simple sequence-based models. However, in general these models fail to efficiently capture the full spectrum of rich dynamics in the input, since the dynamics is not uniformly distributed. In particular, relevant information might be harder to extract and computing power is wasted for processing all individual timesteps, even if they contain no significant changes or no new information. Here we propose TimeGraphs, a novel approach that characterizes dynamic interactions as a hierarchical temporal graph, diverging from traditional sequential representations. Our approach models the interactions using a compact graph-based representation, enabling adaptive reasoning across diverse time scales. Adopting a self-supervised method, TimeGraphs constructs a multi-level event hierarchy from a temporal input, which is then used to efficiently reason about the unevenly distributed dynamics. This construction process is scalable and incremental to accommodate streaming data. We evaluate TimeGraphs on multiple datasets with complex, dynamic agent interactions, including a football simulator, the Resistance game, and the MOMA human activity dataset. The results demonstrate both robustness and efficiency of TimeGraphs on a range of temporal reasoning tasks. Our approach obtains state-of-the-art performance and leads to a performance increase of up to 12.2% on event prediction and recognition tasks over current approaches. Our experiments further demonstrate a wide array of capabilities including zero-shot generalization, robustness in case of data sparsity, and adaptability to streaming data flow.

How to do a dynamic parser in X3?

nospam@example.com (Raymond P. Burkholder) — Sun, 24 Sep 2023 00:26:51 +0000

how to do a dynamic parser in X3?

The presentation went well, but as always with a large surface area library like Spirit it went on for too long IMO (2 hrs). This was similar to what happened when I presented Spirit 2.x almost 10 yrs ago. If I were to present on this again, I would split the topic into two talks:

A background on recursive variant datatypes and visitor pattern and dipping into the Fusion library for adapting arbitrary application structures to Fusion tuples.
Spirit X3 parsing w/annotations and error handling.

I try to present from real code in the IDE as much as possible so that we're looking at real code. I almost always write my own examples with the library I'm presenting and put them on github. The exception is when the library comes with an example that is everything I want to talk about .
I cribbed quite a bit from the spirit x3 fun example presented by Michael Caisse a few years back, but I did extend the AST to handle imaginary numbers.

SaltStack on Debian Bookworm

nospam@example.com (Raymond P. Burkholder) — Wed, 19 Jul 2023 01:09:49 +0000

I found out the hard way that SaltStack and Debian no longer place nice together. I had upgraded a Debian installation from Bullseye to Bookworm, along with the resident Salt Minion. When attempting to use the minion, it no longer starts up, due to various imports no longer working. Which was due to the salt-minion not being upgraded. The error message would started this odyssey:

salt ImportError: cannot import name 'Markup' from 'jinja2'

Taking a look at the Debian Developer Information for Salt, the last version started in 'unstable' was 3004.1 back in December of 2022. This is now almost 8 months later and little or no movement. There was some mention in a ticket somewhere that Salt release cycles don't cater to Debian stable release cycles. Not sure if that is a legitimate reason or not, but, well, for whatever reason, SaltStack management in Debian is no longer a simple no brainer.

However, after a little digging, there is a way to run SaltStack versions 3006 (current as of this writing). It is simple to install on Bullseye, but not easily done on Bookworm.

On Bullseye (as root, or implies sudo):

# cd ~ # apt remove salt-minion salt-master # apt install curl # curl -L https://bootstrap.saltstack.com -o install_salt.sh # sh install_salt.sh -M onedir

The '-M' installs the salt master at the same time (for machines running master). If you forget to do that, you'll need to diagnose and fix the systemctl mask error with the following:

# apt install file # file /etc/systemd/system/salt-master.service # rm /etc/systemd/system/salt-master.service # systemctl daemon-reload # sh install_salt.sh -M onedir

The 'sh install_salt.sh -M onedir' should show a symlink to /dev/nul, which the 'rm ...' will fix.

On Bookworm, the bootstrap isn't scheduled to work till beginning of 2024 sometime I think with Salt 3007 or 3008 -- more info in [FEATURE REQUEST] Add Salt support for Debian 12 #64223 .

In the meantime, I had to cheat a bit:

in /etc/debian_version, change 12.0 to 11.0
in /etc/apt/sources.list, change bookworm to bullseye
rm /etc/apt/sources.list.d/salt.list
run apt update
run the commands listed above for installing the one or both the salt services
restore /etc/debian_version and /etc/apt/sources.list to their original content

I'm sure there are more elegant ways of doing this, but this worked to fake the needed version 11 in the installation script and directory traversal requirements

Note, more info on the Salt Install/Bootstrap Process.

Stop VSCode from adding headers

nospam@example.com (Raymond P. Burkholder) — Sun, 11 Jun 2023 18:04:01 +0000

Visual Studio Code, by default, when using the clangd language server, will automatically insert headers files for types which may have already been declared.

To disable this, go into the configuration options for clangd, and in 'clangd:arguments' add:

--header-insertion=never

Source: Stop VSCode from adding redundant headers

GCC Optimization for Native Architecture

nospam@example.com (Raymond P. Burkholder) — Sun, 27 Mar 2022 17:48:33 +0000

When using Cern's ROOT Data Analysis Framework, there are notes about possible compile time enhancements. It is best to take a look on which cpu variant the compiler will generate code, and for which target cpu the code will run. They need to be compatible.

AVX-512 is an interesting Wikipedia page describing the various cpu variant incarnations and their specialized instruction subsets.

Here are a few command line examples for examining what the gcc compiler sees as being available.

$ gcc -march=native -Q --help=target | grep march
  -march=                               skylake
  Known valid arguments for -march= option:

$ echo | gcc -dM -E - -march=native
  ... large quantity of flags ...

gcc x86 cpu options cmake command:

add_definitions(-march=native)

Even though this optimizes instruction use, it has to be used carefully. It will probably cause side effects with compiling code on one version of cpu and when copying the executables to a different machine, with possibly a different cpu variant.

Reference: How to see which flags -march=native will activate?

C++ References

nospam@example.com (Raymond P. Burkholder) — Sun, 20 Mar 2022 12:29:16 +0000

Tutorial: When to Write Which Special Member - "When explaining someone the rules behind the special member functions and when you need to write which one, there is this diagram that is always brought up. I don’t think the diagram is particularly useful for that, however."

Tutorial: Managing Compiler Warnings with CMake - " But how do you manage the very compiler-specific flags in CMake? How do you prevent your header files from leaking warnings into other projects?"

Papers 2022/02/05

nospam@example.com (Raymond P. Burkholder) — Sat, 05 Feb 2022 22:44:45 +0000

Pipeflow: An Efficient Task-Parallel Pipeline Programming Framework using Modern C++

Pipeline is a fundamental parallel programming pattern. Mainstream pipeline programming frameworks count on data abstractions to perform pipeline scheduling. This design is convenient for data-centric pipeline applications but inefficient for algorithms that only exploit task parallelism in pipeline. As a result, we introduce a new task-parallel pipeline programming framework called Pipeflow. Pipeflow does not design yet another data abstraction but focuses on the pipeline scheduling itself, enabling more efficient implementation of task-parallel pipeline algorithms than existing frameworks. We have evaluated Pipeflow on both micro-benchmarks and real-world applications. As an example, Pipeflow outperforms oneTBB 24% and 10% faster in a VLSI placement and a timing analysis workloads that adopt pipeline parallelism to speed up runtimes, respectively.

3D Modelling Software - C++ API

nospam@example.com (Raymond P. Burkholder) — Sat, 16 Jan 2021 20:30:24 +0000

Since none of this is registered in github, here is my alternate mechanism for 'staring' these items:

FreeCAD - open-source parametric 3D modeler made primarily to design real-life objects of any size. Parametric modeling allows you to easily modify your design by going back into your model history and changing its parameters. You get modern Finite Element Analysis (FEA) tools, experimental CFD, dedicated BIM, Geodata or CAM/CNC workbenches, a robot simulation module that allows you to study robot movements and many more features.
cppyy: Automatic Python-C++ bindings - an automatic, run-time, Python-C++ bindings generator, for calling C++ from Python and Python from C++
Open CASCADE Technology, The Open Source 3D Modeling Libraries - is a software development kit (SDK) intended for development of applications dealing with 3D CAD data or requiring industrial 3D capabilities. It includes a set of C++ class libraries providing services for 3D surface and solid modeling, CAD data exchange, and visualization. tutorial
OpenStudio is a cross-platform (Windows, Mac, and Linux) collection of software tools to support whole building energy modeling using EnergyPlus and advanced daylight analysis using Radiance.
Topologic is a software development kit and plug-in that enables logical, hierarchical and topological representation of spaces and entities

Git Branching

nospam@example.com (Raymond P. Burkholder) — Mon, 28 Dec 2020 01:37:05 +0000

With some help from Learn Git Branching, here is are some 'Coles Notes':

Create a new branch:

git branch _branch_name_

Check out the new branch (for subsequent commits)

git checkout _branch_name_

Switch to the new branch (for subsequent commits, an alternative to checkout)

git switch _branch_name_

Create a new branch and check it out at the same time:

git checkout -b _branch_name_

Merging in Git creates a special commit that has two unique parents. A commit with two parents essentially means "I want to include all the work from this parent over here and this one over here, and the set of all their parents." To merge a branch into main/master, first checkout / switch to the primary branch.

git merge _branch_name_

Rebasing essentially takes a set of commits, "copies" them, and plops them down somewhere else. While this sounds confusing, the advantage of rebasing is that it can be used to make a nice linear sequence of commits. The commit log / history of the repository will be a lot cleaner if only rebasing is allowed.

git switch _branch_name_;git rebase master

To move back in time relative to a branch:

git checkout _branch_name_^

To move back in time relative to HEAD:

git checkout HEAD^

To move back in time relative a number of steps:

git checkout _branch_name_~2

Use relative refs to move a branch around:

git branch -f master HEAD~3

A git reset will move a branch backwards as if the commit had never been made in the first place. It is useful for local branches.

git reset HEAD~1

Git reset doesn't work for remote branches that others are using. In order to reverse changes and share those reversed changes with others, we need to use git revert.

git revert HEAD

To cherry-pick, means copy a series of commits below your current location (HEAD):

git cherry-pick _branch_name_1_ _branch_name_n_

Interactive rebase provides rebasing interactively to re-arrange and delete commits:

git rebase -i HEAD~4

HEAD can be modified with:

git commit --amend

Book: Is Parallel Programming Hard, And, If So, What Can You Do About It?

nospam@example.com (Raymond P. Burkholder) — Tue, 16 Jun 2020 04:15:41 +0000

Is Parallel Programming Hard, And, If So, What Can You Do About It? - The purpose of this book is to help you program shared-memory parallel machines without risking your sanity.

It provides in-depth analysis of parallel programming at the cpu level of programming. A downloadable, git versioned, pdf.

C++ Currency Library

nospam@example.com (Raymond P. Burkholder) — Sat, 09 May 2020 23:48:35 +0000

Some C++ libraries I've encountered for managing currency manipulation:

vpiotr / decimal_for_cpp - this is the one I use for my currency work, is header based, succinct, and easy to use.
I was wondering if boost.multiprecision could be used, and came up with mariusbancila / moneycpp - a bit too much for me as it also handles currency codes. There is a a blog entry: moneycpp – a C++ library for handling monetary values.
General Decimal Arithmetic goes into some detail about the related IEEE 754 standard, as well as a link to a C language implementation.

VIM with CMake and clangd Enhancements

nospam@example.com (Raymond P. Burkholder) — Mon, 23 Dec 2019 18:43:47 +0000

In my search for an editing environment, I am currently running Visual Studio Code with a number of extensions. It is a GUI based IDE environment and seems to be relatively performant. When jumping to symbol declarations and definitions, it can take a few seconds while performing the lookup.

Since I also use VIM for editing files, I have been keeping my eye on VIM add-ons for creating a console based IDE. Some tools which might be applicable:

clangd - the language server protocol daemon for clang
Using LSP & clangd in Vim - notes on integrating clangd, and with a reference at the end about the JSON compilation database, (which in another link, can be supplied via CMake)
vim cp enhanced highlight - with clangd, I am not sure if this is required, but helps with syntax high-lighting without clangd
vim-cpp-modern: Enhanced C and C++ syntax highlighting - another syntax highligher, one is forked from the other, with this one being more current, I believe
VIM CMake - plugin to make working with CMake a little nicer
CMake Support for Vim - CMake Interoperability in Vim - not sure what the difference is to the previous link
Vim Cmake integration - some chatter on StackOverflow on integrating CMake with VIM

NeoVim

Visual Studio Code with CMake and Clangd

nospam@example.com (Raymond P. Burkholder) — Sun, 22 Dec 2019 03:40:30 +0000

Continuing on in the search for a responsive IDE for building large C++ codesets with CMake, the next stop is Visual Studio Code.

The initial install supplies a basic IDE editing environment.

Usability occurs when extensions are loaded. I have the following installed (versions are as of this writing):

C/C++ for Visual Studio Code, v0.26, by Microsoft <== some sites say to disable this, but it is required by the debug function
CMake, v0.0.17, by twxs, which provides CMake language support
CMake Tools, v1.2.3, by Microsoft, which provides the build capability
vscode-clangd, v0.0.19, is the clang language server, which provides code completion and symbol tables

Prior to installation of the extensions, on Debian, I installed the following packages first (in addition to the standard compilers):

cmake
clangd
clangd-9

reddit - PROTIP: Use clangd instead of the official C/C++ extension suggests adding the following to .vscode/settings.json:

"C_Cpp.autocomplete": "Disabled",
"C_Cpp.formatting": "Disabled",
"C_Cpp.errorSquiggles": "Disabled",
"C_Cpp.intelliSenseEngine": "Disabled",

In addition, I have a more recent version of clangd, so in .vscode/settings.json, I also have:

"clangd.path": "/usr/lib/llvm-9/bin/clangd",

My over all file looks like:

{
  "clangd.path": "/usr/lib/llvm-9/bin/clangd",
  "C_Cpp.autocomplete": "Disabled",
  "C_Cpp.formatting": "Disabled",
  "C_Cpp.errorSquiggles": "Disabled",
  "C_Cpp.intelliSenseEngine": "Disabled",
  "C_Cpp.default.intelliSenseMode": "clang-x64",
  "editor.tabSize": 2
}

In the root directory of the project, a link is required (the setting "compileCommands" in .vscode/c_cpp_properties.json does not appear to be used by the clangd module) :

ln -s build/compile_commands.json .

2023/11/05 - Or add the following to .vscode/settings.json (which causes the file to be created and referenced directly in the basic workspace directory):

"cmake.copyCompileCommands": "${workspaceFolder}/compile_commands.json"

As a note, if compile_commands.json is not provided, then the following flag needs to be passed to CMake (seems to be defaulted "on" in current Visual Studio Code):

-DCMAKE_EXPORT_COMPILE_COMMANDS=1

Documentation for clangd is found at Getting started with clangd.