FPBench Logo

Community Tools

Floating-point Research Tools

The floating-point research community has developed many tools that authors of numerical software can use to test, analyze, or improve their code. They are listed here in alphabetical order and categorized into three general types: 1) dynamic tools, 2) static tools, 3) commercial tools, and 4) miscellaneous. You can find tutorials for some of these tools at fpanalysistools.org, and many of the reduced-precision tools were compared in a recent survey by authors from the Polytechnic University of Milan.

Tools on the list are not endorsed, and need not endorse or support the FPBench project. The list is intended to be exhaustive.

Dynamic Tools

ADAPT (Automatic Differentiation Applied to Precision Tuning) is a tool developed by Mike Lam from JMU and Harshitha Menon from LLNL. It uses the reverse mode of algorithmic differentiation to determine how much precision is needed in a program's inputs and intermediate results in order to achieve a desired accuracy in its output, converting this information into precision recommendations.
AMP (Automated Mixed Precision) is a tool developed by Ralph Nathan and Daniel Sorin (Duke University) for mixed-precision analysis. It begins with a single-precision version of a program and uses a variety of metrics and heuristics to gradually raise certain portions of the program to higher precision in order to achieve the desired accuracy in the outputs.
AMPT-GA is a tool developed by a multi-institutional team for automated source-level mixed-precision for GPU applications, combining various static and dynamic analysis approaches.
Atomu is a tool developed by Daming Zou (Peking University) that uses "atomic" condition numbers of individual floating-point operations to estimate the introduced error. It also performs an evolutionary search to find inputs that trigger the highest condition number for each operation. The instrumentation portion of the tool is implemented as an LLVM pass.
CADNA (Control of Accuracy and Debugging for Numerical Applications) estimates round-off errors by replacing floating-point arithmetic with discrete stochastic arithmetic (DSA), essentially modeling error using randomized rounding (up or down) instead of round-to-nearest-even. A new tool called PROMISE (PRecision OptiMISE) uses this technique to perform a delta-debugging search and report mixed precisions for C and C++ programs.
CRAFT (Configurable Runtime Analysis for Floating-Point Tuning) was originally a framework based on the Dyninst binary analysis toolkit to perform various dynamic floating-point analyses, including cancellation detection, dynamic range tracking, and automated mixed-precision analysis. CRAFT can perform an automated search of a program's instruction space, determining the level of precision necessary in the result of each instruction to pass a user-provided verification routine assuming all other operations are done in double precision. This gives a general overview of the approximate precision requirements of the program, providing insights and guidance for mixed-precision implementation.
There is a branch of CRAFT by Andrew Shewmaker (OpenEye Scientific, formerly LANL) that adds a new analysis mode for extracting histograms of all floating-point values encountered during a program run. He has also been building a variant of this tool based on SHVAL (described below).
FloatSmith is an integration project that provides automated source-to-source tuning and transformation of floating-point code for mixed precision using three existing tools: 1) CRAFT, 2) ADAPT, and 3) TypeForge. The primary search loop is guided by CRAFT and uses TypeForge to prototype mixed-precision versions of an application. ADAPT results can optionally be used to help narrow the search space, and there are a variety of options to customize the search.
FPDetect is a tool from the University of Utah for synthesizing error detectors for stencil applications. It begins with an offline phase where affine analysis is used to tighly estimate the floating-point precision preserved across computations for intervals of data points. This bound along with an auxiliary direct evaluation technique is then utilized to instantiate optimally situated detectors across the compute space when executing online. Violations of this bound can be attributed with certainty to errors such as soft-errors or logical bugs.
FPDiff is a tool by Jackson Vanover (UC Davis) that performs automated differential testing on mathematical library source code, extracting numerical function signatures, synthesizing drivers, creating equivalence classes of synonymous functions, and executing tests to detect meaningful numerical discrepancies between implementations.
FPED is a dynamic analysis that detects arithmetic expression errors using MPFR and test set generation.
FPGen is a tool by Hui Guo (UC Davis) that uses symbolic execution to generate inputs that maximize numerical error, finding more errors than S3FP or KLEE.
fpPrecisionTuning is a combination of two Python tools (C2mpfr and DistributedSearch) that together can be used to find mixed-precision versions of C programs. Like CRAFT and Precimonious, this tool performs a search over the mixed-precision search space using a user-given error bound, but this tool uses MPFR and source code modification to simulate non-standard precisions. The authors have applied this toolchain to a variety of examples from different domains including fixed-precision computations on an FPGA.
FPSpy is a tool by Peter Dinda (Northwestern) that uses several different forms of runtime analysis to detect and measure various floating-point issues. It is implemented as an LD_PRELOAD library and works on unmodified x64 executables, monitoring floating-point exceptions and other events of interest to report results with varying levels of granularity and overhead.
GPUMixer is a tool written by a multi-institutional team for LLVM-based mixed-precision analysis that is performance-focused, looking for potential speedups first and then verifying their accuracy afterwards.
Herbie automatically improves the accuracy of floating-point expressions. Herbie has been used to fix two bugs in math.js, and it also provides a web interface.
The authors of Daisy and Herbie have worked together to combine their tools:
Herbgrind is a dynamic analysis tool for binaries (built on Valgrind) to help find the root cause of floating-point error in large programs.
HiFPTuner is an extension of Precimonious (see below) that uses dependence analysis and edge profiling to enable a more efficient hierarchical search algorithm for mixed-precision configurations.
pLiner is a tool for automatically identifying code that will trigger compiler-induced numerical variability. It works by combining selective precision enhancements and a guided dynamic search, decreasing the amount of time required to detect issues.
PositDebug / FPSanitizer
PositDebug and FPSanitizer are tools from the Rutgers Architecture and Programming Languages group. Both use LLVM-based instrumentation and MPFR shadow values to check Posit (the former) and regular floating-point code (the latter). This approach is faster than the Valgrind-based instrumentation used by Herbgrind.
Precimonious is a tool written by Cindy Rubio González (currently UC Davis) that leverages the LLVM framework to tweak variable declarations to build and prototype mixed-precision configurations.
More recent work uses shadow value analysis to speed up the search.
PyFloT is a tool for mixed-precision tuning of applications using an instruction-centric analysis that uses call stack information and temporal locality to address the large scale of HPC scientific programs. PyFloT supports applications written in a combination of Python, CUDA, C++, and Fortran.
Shaman is a library that uses operator overloading and error-free transformations to track numerical error thorough computations by replacing floating-point numbers with instrumented alternatives. It can evaluate the number of significant digits of any floating-point number in a computation and has been designed with a focus on high-performance computating and simulation. It is compatible with mixed precision and parallelism and does not interfere with the control flow of the program.
SHVAL (SHadow Value Analysis Library) is a tool by Mike Lam that uses the Intel Pin instrumentation framework to perform full heavyweight shadow-value analysis for floating-point arithmetic, effectively allowing developers to simulate any real-numbered representation for which a plugin can be developed.
This work was extended by Ramy Medhat (U. of Waterloo) to perform per-address or per-instruction floating-point error analysis. Using this framework, he developed several mixed-precision versions of an image processing library, demonstrating how the tool can assist with making performance, energy, and accuracy tradeoffs in the embedded domain.
STOKE is a general stochastic optimization and program synthesis tool from Stanford that has been extended to handle floating-point computation. STOKE can trade off accuracy for performance for floating-point computations.
Verificarlo is a tool written by Pablo de Oliveira Castro (Université de Versailles St-Quentin-en-Yvelines), Eric Petit, and Christophe Denis (ENS Cachan). It leverages the LLVM framework for automated Monte Carlo Arithmetic analysis.
Verificarlo (see above) has recently been extended by Yohan Chatelain, including the addition of contextual information as well as a temporal visualization tool called VeriTracer.
Verrou is a Valgrind-based tool written by François Févotte and others from EDT in France. The tool can simulate choosing a random rounding mode for every operation, and performs a delta debugging search to find areas of code that fail a user-provided verification routine under this scheme. It can also use the Verificarlo front-end (see above) for lower overhead.

Static Tools

Alive / LifeJacket
Alive verifies that LLVM IR optimizations don't change program behavior, and includes support for floating-point instructions in an extension named LifeJacket. Alive has verified 43 LLVM optimizations and discovered eight incorrect ones.
Alive-NJ verifies that LLVM IR optimizations don't change program behavior, and includes support for floating-point instructions. Alive-NJ has found five bugs in LLVM optimization passes.
cohd / syhd
Laurent Thévenoux at LIP, ENS de Lyon has written tools for source-to-source error compensation and synthesis of floating-point code.
Daisy is another tool by Eva Darulova (the author of Rosa, see below) that combines various types of static and dynamic analysis of floating-point code and is designed to be modular and extensible. Eva also has a paper with Anastasiia Izycheva on computing sound relative errors. They find that this can produce much tighter error bounds than analyzing absolute error, and an implementation is included in Daisy. Finally, Eva has a paper with Einar Horn and Saksham Sharma on mixed-precision tuning in Scala and C, which is also included in Daisy.
The authors of Daisy and Herbie have worked together to combine their tools:
dco / scorpio
A group from CERTH & University of Thessaly and RWTH Aachen have written a significance analysis system using a combination of interval arithmetic and automatic differentiation. The software does not yet seem to be publicly available.
The Efficient SMT-based Bounded Model Checker (ESBMC) is an SMT-based model checker for single- and multi-threaded C/C++ programs with support for floating-point arithmetic.
Flocq is a floating-point formalization for Coq, written by Sylvie Boldo and others. It provides a library of theorems for analyzing arbitrary-precision floating-point arithmetic.
Floating-point stability analysis via SMT solving and theorem proving.
FPTaylor / FPTuner
A group at the University of Utah describes a way of using partial derivatives and symbolic Taylor series expansions to bound floating-point roundoff error; this achieves much tighter bounds than interval/affine/SMT methods. More recently, they have extended theird work by performing a broad comparison of many error bounding analyses.
They have also built a mixed-precision tool for Python code based on this technique. Note that it also requires the commercial Gurobi Optimizer.
Gappa helps automated reasoning about floating-point and fixed-point arithmetic. It has been used to prove the CRlibm implementations of elementary functions correct.
POP (Precision OPtimizer) is a tool written by Dorra Ben Khalifa (Laboratory of Mathematics and Physics, University of Perpignan Via Domitia) that uses abstract interpretation to determine the minimal precision required on inputs. The primary current contribution is the application of these techniques to the Internet of Things (IoT) domain.
PRECiSA (Program Round-off Error Certifier via Static Analysis) is a tool from the NIA and NASA that calculates symbolic error bounds using the PVS language. The authors also provide a web interface. More recently, the FPRoCK tool was written in collaboration with authors at the University of Utah to solve mixed real and floating-point programs, which improved the accuracy of PRECiSA.
Victor Magron and others present a framework for providing upper bounds on absolute roundoff errors using semidefinite programming and sums of squares certificates. The results can be checked using the Coq theorem prover.
Rosa is a source-to-source translator written by Eva Darulova (now at MPI-SWS, previously EPFL) for the Scala language. It uses an SMT solver to annotate a program with mixed-precision types.
Salsa is an abstract-interpretation-based floating-point analysis tool that suggests transformations to minimize floating-point error, and is the dissertation work of Nasrine Damouche. The source code is not available.
Satire is a sound rounding error analysis and estimator that uses several techniques like path strength reduction and bound optimization to improve the scalability of static analysis. This analysis scales to hundreds of thousands of operations.
TAFFO is an LLVM-based conversion and autotuning framework that tries to replace floating-point operations with fixed-point operations to the extent possible.

Commercial Tools

Astrée is a commercial static analysis tool for proving the correctness of C code, including all floating-point computation. It has been used to prove the absence of runtime exceptions in flight control software for Airbus planes.
Fluctuat is a commercial, abstract-interpretation-based static analysis tool for analyzing numerical code in C or Ada and characterizing the errors caused by the approximation inherent in floating-point arithmetic. Fluctuat has been used to verify safety-critical embedded systems. It was originally written by Eric Goubault, Matthieu Martel, and Sylvie Putot.
Numalis Software Suite
Numalis Software Suite is a collection of several commercial tools for static and dynamic analysis of floating-point code with the goal of generating more accurate codes and making mixed-precision decisions, targeted primarily at AI and neural network domains.


CompCert is a compiler that has been formally verified to preserve floating-point semantics using Coq.
Fbench / Ffbench
Fbench and Ffbench are two floating-point performance benchmarks by John Walker (Fourmilab).
FLiT: Floating-point Litmus Tests
FLiT is a test infrastructure for detecting variability in floating-point code due to differences in compilers and execution environments. It is currently developed and maintained by Michael Bentley at the University of Utah.
FPBench is a community benchmark project for floating-point examples.
FPHPC Bechmark Suite
This is a collection of programs containing floating-point arithmetic with the immediate goal of aiding the development and evaluation of floating-point precision and correctness analysis tools. The long-term goal is to support and enable the scaling of such tools to provide insights for high-performance computing (HPC) applications. This collection aims to address the "middle ground" between real-valued expressions (such as those contained in the FPBench collection) and full HPC applications.
KLEE is a symbolic execution virtual machine built on LLVM with a benchmark suite of example floating-point programs. There is also a fork that supports reasoning over floating-point constraints.
Posit libraries
Posits are the newest iteration on John Gustafson's Unum representation (see below). Unlike Unums, Posits have a fixed overall width but the number of exponent and significand bits vary according to a run-length-encoded "regime" field. It provides higher accuracy and a greater dynamic range than identically-sized IEEE formats.
Another project from the University of Utah (see FPTaylor and FPTuner above), S3FP is a tool for finding floating-point inputs that will maximize error.
SOAP (Structural Optimisation of Arithmetic Programs) is a tool for optimizing FPGA-based implementations of floating-point code.
Sollya is a library and environment for safe floating-point development, specifically intended to aid the development of math library functions.
Unum library
Unums ("universal numbers") are a novel binary representation for real numbers by John Gustafson that exactly quantifies the uncertainty of a stored value. It extends IEEE floating-point by making both the exponent and signficand fields variable-sized, and it adds an "exact" bit that is set if the number is exact and unset if the number represents a range between representable numbers. It provides a way to obtain mathematically-rigorous results in numeric code at the cost of computational efficiency.
VFLOAT is a a variable-precision floating-point library for FPGAs, developed and maintained by Miriam Leeser at Northeastern University.

This list is primarily maintained for the research community by Mike Lam. If you are developing or are maintaining a floating-point tool, or know of a tool that isn't on the list, please email the list.