FlowSieve  3.4.0
FlowSieve Coarse-Graining Documentation

FlowSieve

FlowSieve is developed as an open resource by the Complex Flow Group at the University of Rochester, under the sponsorship of the National Science Foundation and the National Aeronautics and Space Administration. Continued support for FlowSieve depends on demonstrable evidence of the code’s value to the scientific community. We kindly request that you cite the code in your publications and presentations. FlowSieve is made available under The Open Software License 3.0 (OSL-3.0) (see the license file or the human-readable summary at the end of the README), which means it is open to use, but requires attribution.

The following citations are suggested:

For journal articles, proceedings, etc., we suggest:

  • Storer et al., (2023). FlowSieve: A Coarse-Graining Utility for Geophysical Flows on the Sphere. Journal of Open Source Software, 8(84), 4277, (https://doi.org/10.21105/joss.04277)

Other articles that may be relevant to the work are:

  • Storer, B.A., Buzzicotti, M., Khatri, H. et al. Global energy spectrum of the general oceanic circulation. Nat Commun 13, 5314 (2022). (https://doi.org/10.1038/s41467-022-33031-3)
  • Aluie, Hussein, Matthew Hecht, and Geoffrey K. Vallis. "Mapping the energy cascade in the North Atlantic Ocean: The coarse-graining approach." Journal of Physical Oceanography 48.2 (2018): 225-244: (https://doi.org/10.1175/JPO-D-17-0100.1)
  • Aluie, Hussein. "Convolutions on the sphere: Commutation with differential operators." GEM-International Journal on Geomathematics 10.1 (2019): 1-31: (https://doi.org/10.1007/s13137-019-0123-9)

For presentations, posters, etc., we suggest acknowledging:

  • FlowSieve code from the Complex Flow Group at University of Rochester

Primary Features

  1. computes coarse-grained scalar and vector fields for arbitrary filter scales, in both Cartesian and spherical coordinates,
  2. built-in diagnostics for oceanographic settings, including kinetic energy (KE), KE cascades, vorticity, divergence, etc.,
  3. built-in post-processing tools compute region averages for an arbitrary number of custom user-specified regions [ avoiding storage concerns when handling large datasets ], and
  4. includes Helmholtz-decomposition scripts to allow careful coarse-graining on the sphere [ i.e. to maintain commutativity with derivatives ].

Usage

  • Users can expect to compile the executables in 'Case Files' and use them as command-line utilities to process existing netCDF-4 data. The tutorials illustrate the steps required for this usage, as well as highlighting the kind of outputs / analysis that can be obtained.
  • Developers can use FlowSieve as a C++ library and develop additional diagnostic / analysis routines using the FlowSieve codebase.

Statement of need

Aluie 2018 demonstrated that, when applied appropriately, coarse-graining is not merely a data-processing technique, but can also be applied to the governing equations themselves. This provides a physically meaningful and mathematically coherent way to quantify not only how much energy is contained at different length scales, but also how much energy is transferred across scales.

FlowSieve is a heavily-parallelized coarse-graining codebase that provides tools for spatially filtering both scalar fields and vector fields in Cartesian and spherical geometries. Specifically, filtering velocity vector fields on a sphere provides a high-powered tool for scale-decomposing oceanic and atmospheric flows following the mathematical results in Aluie 2019.

FlowSieve is designed to work in high-performance computing (HPC) environments in order to efficiently analyse large oceanic and atmospheric datasets, and extract scientifically meaningful diagnostics, including scale-wise energy content and energy transfer.
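At its core, the filtering operation is a convolution of a field against a smoothing kernel. The following is a toy 1-D sketch only: the `coarse_grain` helper is hypothetical, uses a simple top-hat kernel, and ignores the spherical geometry and graded kernels that FlowSieve actually uses (following Aluie 2019).

```python
import numpy as np

def coarse_grain(field, dx, ell):
    """Low-pass a 1-D field by convolving with a normalized
    top-hat kernel of width ell (illustrative only)."""
    half_width = int(round(ell / (2 * dx)))
    kernel = np.ones(2 * half_width + 1)
    kernel /= kernel.sum()   # normalize so the field mean is preserved
    return np.convolve(field, kernel, mode='same')

# A large-scale signal plus a small-scale perturbation
x = np.linspace(0, 2 * np.pi, 256, endpoint=False)
dx = x[1] - x[0]
field = np.sin(x) + 0.3 * np.sin(20 * x)

# Filtering at a scale ell ~ 1 strongly damps the small-scale component
smoothed = coarse_grain(field, dx, ell=1.0)
```

Subtracting the filtered field from the original isolates the sub-filter-scale part of the signal, which is the starting point for scale-wise energy diagnostics.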

Verifying Installation

The tutorials, in addition to providing introductory instruction to using FlowSieve, also provide a way to verify that your installation is working as expected. The provided Jupyter notebooks include the figures that were generated by the developers, and provide a reference. As always, feel free to contact the developers for assistance (see Community Guidelines below).


Community Guidelines

  • Contributing
    • At the current stage of development, anyone seeking to contribute to the FlowSieve codebase is asked to contact the main developers (see Seeking Support) to discuss the best way to integrate their contributions. The codebase is maintained on GitHub, and contributions will ultimately result in merging commits into the main branch.
    • It is recommended to use a forked repository for active development, since it allows testing in a separate environment before merging.
  • Reporting Issues
    • Please report issues using the GitHub issue tracking tools. Issues can also be submitted by email (see Seeking Support), but the issue tracker is preferred.
  • Seeking Support
    • The best way to obtain support is to contact Hussein Aluie or Benjamin Storer by email. Contact information is available at the Complex Flow Group webpage (http://www.complexflowgroup.com/people/).

Tutorial

A series of basic tutorials is provided, outlining various use cases as well as how to use / process the outputs.


Methods

Some details regarding underlying methods are discussed on this page (warning, math content).

Helmholtz Decomposition

For notes about the Helmholtz decomposition, go to this page.


Compilation / Installation

For notes on installation, please see this page.


Input / Output Files and Filetypes

The coarse-graining codebase uses netCDF files for both input and output. Dimension orderings are assumed to follow the CF convention of (time, depth, latitude, longitude).

scale_factor, offset, and fill_value attributes are applied to output variables following CF-convention usage.
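For reference, CF-style packing relates the stored integers to physical values via value = stored * scale_factor + offset. A small numpy sketch (the numbers here are made up; CF formally names the attributes scale_factor, add_offset, and _FillValue):

```python
import numpy as np

# Made-up physical values and packing parameters (illustrative only)
values = np.array([250.0, 500.0, 1000.0])
scale_factor = 0.5
add_offset = 0.0

# Writing: pack to integers; reading: unpack back to physical values
packed = np.round((values - add_offset) / scale_factor).astype(np.int32)
unpacked = packed * scale_factor + add_offset
```

Most netCDF libraries (e.g. netCDF4-python, or xarray with its default decoding) apply this unpacking automatically on read, so it usually only matters if you are handling the raw stored values yourself.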

Where possible, units and variable descriptions (which are provided in constants.hpp) are also included as variable attributes in the output files.

Currently, no other filetypes are supported.


Postprocessing

Post-processing (such as region-averaging, Okubo-Weiss histogram binning, time-averaging, etc) can be enabled and run on-line by setting the APPLY_POSTPROCESS flag in constants.hpp to true.

This will produce an additional output file for each filtering scale.

Various geographic regions of interest can be provided in a netCDF file.
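As a sketch of what defining a custom region might look like, the following builds a simple 1/0 latitude–longitude box mask with numpy. The region name, box bounds, and grid here are all hypothetical; consult the tutorials for the exact variable names and layout that the region file should contain.

```python
import numpy as np

# Regular 1-degree grid (illustrative only)
lat = np.linspace(-90, 90, 181)
lon = np.arange(0, 360, 1.0)
LON, LAT = np.meshgrid(lon, lat)

# Hypothetical "Gulf Stream box" region: 30N-45N, 280E-310E
region_mask = ((LAT >= 30) & (LAT <= 45) &
               (LON >= 280) & (LON <= 310)).astype(int)
```

This mask (and any others) would then be written to a netCDF file on the same horizontal grid as the input data.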


Command-line Arguments

  1. --version
Calling ./coarse_grain.x --version prints a summary of the constants / variables used when compiling.
  2. --help
    • Calling ./coarse_grain.x --help prints a summary of the run-time command-line arguments, as well as default values (if applicable).

Specifying Filtering Scales

When specifying filtering scales, consider a wide sweep. Logarithmically-spaced scales can also be beneficial for plotting purposes, and Python is helpful for generating them. For example, numpy.logspace( np.log10(50e3), np.log10(2000e3), 10 ) would produce 10 logarithmically-spaced filter scales between 50 km and 2000 km.

Hint: to print filter scales to only three significant digits, the numpy.format_float_scientific function can help.

import numpy as np

number_of_scales = 10
smallest_scale = 50e3     # 50 km, in metres
largest_scale = 2000e3    # 2000 km, in metres

scales = np.logspace( np.log10(smallest_scale), np.log10(largest_scale), number_of_scales )
for scale in scales:
    print( np.format_float_scientific( scale, precision = 2 ), end = ' ' )

If you are using a bash script (e.g. a job-submission script), an easy way to pass the filter scales on to the coarse-graining executable is to define a variable holding the list of scales, and then pass that to the executable using the --filter_scales flag.

FILTER_SCALES="1e4 5e4 10e4"
./coarse_grain.x ... --filter_scales "${FILTER_SCALES}"

Known Issues

Some known issues (with solutions where available) are given on this page.


Technical Matters

DEBUG flag

Setting the DEBUG flag in the Makefile specifies how much information is printed during runtime.

This list may not be quite up-to-date. Rule of thumb:

  • DEBUG <= -2 silences netCDF errors
  • Use DEBUG = 0 for normal production runs
  • Use DEBUG = 1 if you want to keep track of the progress of a longer production run
  • Use DEBUG = 2 if you're running into some issues and want to narrow them down a bit
  • Going beyond this is really only necessary / useful if you're running into some fatal errors that you can't pinpoint
  • Setting DEBUG to a negative value is generally not advised. Setting it to 0 shouldn't produce much output, and certainly not enough to hamper performance. If you're trying to silence errors, make sure you understand why the errors are happening, and that you're really okay with ignoring them.

Additionally, setting DEBUG >= 1 will result in slower runtime, since it enables bounds-checking in the apply-filter routines ( i.e. vector.at() vs vector[] ). These routines account for the vast majority of runtime outside of very small filter scales (which are fast enough not to be a concern), and so this optimization was only applied to those routines.

Function Map

See the function map for the main filtering function to get an overview of the function dependencies.

Licence

This is a brief human-readable summary of the OSL-3.0 licence, and is not the actual licence. See licence.md for the full licence details.

You are free:

  • To share: To copy, distribute and use the database.
  • To create: To produce works from the database.
  • To adapt: To modify, transform and build upon the database.

As long as you:

  • Attribute: You must attribute any public use of the database, or works produced from the database, in the manner specified in the license. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database.