Wouldn’t it be great to be able to access data easily with any of your favorite languages?
Build advanced apps and workflows?
XCOMPUTE utilizes a new strategy (originally developed by Google) to express complex data between computers / sessions as protocol buffers.
When you save or load to disk or transmit something over a network, the associative data structures present in your computer’s RAM must be flattened (aka serialized), buffered, and eventually reconstructed (aka deserialized) so that they can be transmitted in linear fashion across a wire or into a storage device and back again.
There are many ways to do this, but most are not suitable for big data.
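As a rough illustration of the flatten-and-reconstruct cycle, the sketch below serializes an associative container into a linear byte buffer and rebuilds it. This is a hand-rolled toy format (length-prefixed keys and arrays in host byte order), not the protocol buffer wire format and not XCOMPUTE code; the names `Table`, `put`, `get`, `serialize`, and `deserialize` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Table = std::map<std::string, std::vector<double>>;

// Append the raw bytes of a trivially copyable value to the buffer.
template <typename T>
void put(std::vector<std::uint8_t>& buf, const T& value) {
    const auto* p = reinterpret_cast<const std::uint8_t*>(&value);
    buf.insert(buf.end(), p, p + sizeof(T));
}

// Read a trivially copyable value at offset `pos`, then advance `pos`.
template <typename T>
T get(const std::vector<std::uint8_t>& buf, std::size_t& pos) {
    assert(pos + sizeof(T) <= buf.size());
    T value;
    std::memcpy(&value, buf.data() + pos, sizeof(T));
    pos += sizeof(T);
    return value;
}

// Flatten: [entry count], then per entry [key length][key bytes][n][n doubles].
std::vector<std::uint8_t> serialize(const Table& table) {
    std::vector<std::uint8_t> buf;
    put<std::uint64_t>(buf, table.size());
    for (const auto& entry : table) {
        put<std::uint64_t>(buf, entry.first.size());
        buf.insert(buf.end(), entry.first.begin(), entry.first.end());
        put<std::uint64_t>(buf, entry.second.size());
        for (double v : entry.second) put(buf, v);
    }
    return buf;
}

// Reconstruct the associative structure from the linear byte stream.
Table deserialize(const std::vector<std::uint8_t>& buf) {
    Table table;
    std::size_t pos = 0;
    const auto entries = get<std::uint64_t>(buf, pos);
    for (std::uint64_t e = 0; e < entries; ++e) {
        const auto key_len = get<std::uint64_t>(buf, pos);
        assert(pos + key_len <= buf.size());
        std::string key(reinterpret_cast<const char*>(buf.data() + pos), key_len);
        pos += key_len;
        const auto n = get<std::uint64_t>(buf, pos);
        std::vector<double> values(n);
        for (auto& v : values) v = get<double>(buf, pos);
        table.emplace(std::move(key), std::move(values));
    }
    return table;
}
```

A format like this works, but it has no schema, no versioning, and no cross-language story, which is the gap a generated interface is meant to close.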
We’ve elected to use a special protoc compiler to auto-generate compatible interfaces that provide native access across many languages. They’re essentially feather-weight code headers or libraries that allow you to tie into xcompute.
They also sport speeds approaching the theoretical limits of the attached devices and channels (PCIe, etc).
Messages™ by Xplicit Computing provides standard support for:
While xcompute-server remains a proprietary centerpiece of the XC ecosystem, we’re excited to announce our plan to release our other official Apps, free & open!
This way, everyday users do not have to worry about subscription to xcompute-client. It makes collaboration that much easier.
Hosts maintain their xcompute-server subscriptions and now can invite friends and colleagues freely, and share results as they please with said Apps.
You own and control your data, while Xplicit continues to focus on providing high-quality, unified technologies.
For a technical overview, please read this below excerpt from the README provided with the Messages™ bundle:
XCOMPUTE MESSAGES: READ ME
XC MESSAGES 2019
UNIVERSAL HIGH-PERFORMANCE NUMERIC SCHEMA / FORMAT
for complex systems, FEA, CFD, EDA, and computational geometry
These proto files provide direct access to xcompute messages (file and wire) by generating accessor functions for your favorite languages. This lets xcompute users and non-users directly access and manipulate the setup and data going to and from xcompute in a high-performance, universal way, suitable for integration with other applications and scripts. These four files are central to the xc ecosystem (both open and proprietary apps) and are organized as follows:
vector.proto – arrays and linear attributes that benefit from packed arena allocation
geometry.proto – topological description of elements and regions for a specific domain
meta.proto – user profile and meta-data for a specific system
This protocol buffer format can deliver high-performance, elegant, and flexible numerical messaging across many domains of science and engineering (e.g. single- and double-precision floating point data).
universal formats for storage and transmission between demanding applications
cross-platform accessors and serialization utilities
flexible associative and vectorized data patterns
object-oriented modularity with reverse and forward compatibility
supports multi-threaded I/O within vectors and across files
maximum individual file size is 2GB, limiting individual systems to ~256M nodes (limited by Google’s 32-bit memory layout)
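One way to see how a 2GB ceiling translates into roughly 256M nodes is the arithmetic below. It assumes, purely for illustration, that "2GB" means 2^31 bytes and that the limiting array stores one 8-byte value (such as a double) per node; neither assumption is stated in the README, and `max_entries` and `kLimitBytes` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Largest number of fixed-size entries that fit in one message of `limit_bytes`.
constexpr std::uint64_t max_entries(std::uint64_t limit_bytes,
                                    std::uint64_t bytes_per_entry) {
    return limit_bytes / bytes_per_entry;
}

// Assumed reading of "2GB": 2^31 bytes.
constexpr std::uint64_t kLimitBytes = std::uint64_t{1} << 31;

// 2^31 / 8 = 2^28 entries, i.e. 256 Mi eight-byte values.
static_assert(max_entries(kLimitBytes, 8) == 268435456ULL, "2^28 entries");
// Four-byte values (such as single-precision floats) would double that count.
static_assert(max_entries(kLimitBytes, 4) == 536870912ULL, "2^29 entries");
```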
Large systems should be decomposed into several smaller systems if possible, for many reasons. It’s more efficient and accurate to specialize the physics, mediating across regions where required. Try not to solve extra DOFs unnecessarily by making one huge domain that solves everything. Memory requirements vary across methods, but are generally limited by your compute device memory, not the storage format or SSD. It is up to each workgroup to determine what is an appropriate resolution for each study. A top-down systems approach is the best way to resolve from low to high fidelity and maintain accountability across the team…
B. USING XCOMPUTE BINDINGS FOR YOUR PROJECT
In your environment, various classes should become available. In C++ they can be found under the namespace Messages:: . Refer to the *.proto definitions for how each attribute is defined, knowing that your access pattern is built from these assignments directly. You can access this associative database using getters and setters…
In C++, the pattern for accessing primitives (bool, int32, int64, float, double, string) looks like:
auto some_value = msg.something();
Repeated fields (vectors, etc) can be accessed somewhat normally. Range-based iteration:
for (auto entry : msg.vector() )
something = entry;
Or alternatively for parallel iteration:
auto N = msg.vector_size();
#pragma omp parallel for
for (auto n=0; n<N; n++)
something[n] = msg.vector(n);
More complex data structures may require mutable getters:
auto N = other.vector_size();
//get a reference to mutable object
auto& vec = *msg.mutable_vector();
#pragma omp parallel for
for (auto n=0; n<N; n++)
vec[n] = other.vector(n);
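To try these access patterns without generating bindings first, here is a minimal hand-written stand-in that mimics the accessor naming protoc produces for C++ (`field()`, `set_field()`, `field_size()`, `field(i)`, `add_field()`, `mutable_field()`). The `Sample` class, its `something` and `vector` fields, and `copy_vector` are hypothetical; real generated classes store repeated numeric fields in `google::protobuf::RepeatedField` rather than `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a message generated from:
//   double something = 1;
//   repeated double vector = 2;
class Sample {
public:
    // Primitive field: getter and setter.
    double something() const { return something_; }
    void set_something(double v) { something_ = v; }

    // Repeated field: size, indexed getter, appender, whole-field getters.
    int vector_size() const { return static_cast<int>(vector_.size()); }
    double vector(int i) const { return vector_[static_cast<std::size_t>(i)]; }
    void add_vector(double v) { vector_.push_back(v); }
    const std::vector<double>& vector() const { return vector_; }
    std::vector<double>* mutable_vector() { return &vector_; }

private:
    double something_ = 0.0;
    std::vector<double> vector_;
};

// Copy the repeated field of `other` into `msg` through the mutable getter.
void copy_vector(const Sample& other, Sample& msg) {
    const int N = other.vector_size();
    auto& vec = *msg.mutable_vector();
    // Size the destination once, before the loop writes by index.
    vec.resize(static_cast<std::size_t>(N));
    // Each n writes a distinct, pre-sized slot, so this loop is a
    // candidate for `#pragma omp parallel for` as in the text above.
    for (int n = 0; n < N; n++)
        vec[static_cast<std::size_t>(n)] = other.vector(n);
}
```

The point of sizing the destination first is that indexed writes into a repeated field are only valid for slots that already exist.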
If you’re an advanced application programmer, you may wish to build upon our bindings to customize against your own projects. This is encouraged as long as existing definitions are not altered. Use a Google Protobuf 3 compiler to generate your new bindings. A protoc 3 compiler may be readily available in a package manager, or can be installed from online sources or binaries. Proto2 will not work; the compiler must be Proto3 or newer.
After protoc is installed, make a directory for each language and run the compiler from shell or script:
XCOMPUTE’s graphics architecture is built on OpenGL 3.3 with some basic GLSL shaders. The focus has always been on efficiency and usefulness with large engineering data sets – it is meant to visualize systems.
However, along the way we recognized that we could unify all graphics objects (technically, vertex array objects) in our render pipeline as to not only handle 3d objects, topologies, and point clouds, but provide a powerful framework for in-scene widgets and helpers. We’ve barely started on that:
As we’re getting ready to launch the product, I’m connecting modules that perhaps didn’t have priority in the past. The other night, I spent a few hours looking at what easy things we could do with a unified “appearance” widget, built in the client with Qt in about 130 lines:
I then imported a complex bracket geometry and applied a wood PNG texture with RGBA channels projected in the Z-direction:
This looks pretty good for rasterization (60fps @ 1440×2560), but it isn’t perfect… there are a few artifacts and shadowing is simplified. I think the space between the wood slats is really cool and makes me want to grab this thing and pull it apart. Those gaps are simply from the alpha-channel of the PNG image…just for fun. We’ll expose more bells and whistles eventually.
Soon, I’ll show the next step of analyzing such a component including semi-realistic displacement animations.
In the future (as we mature our signed distance infrastructure), we may look at ray-tracing techniques, but for now the focus is on efficiency for practical engineering analyses.
It’s no secret around here that I’ve been burning the candle from both ends in order to complete “The Great Server-Client Divide” as we call this year-long task. A task that has been in planning since the very start.
With big-data applications, it’s challenging to get a server (simulation state machine) to interact (somewhat generically) with any number of clients without compromising on performance. We studied the principles and the mechanics of this issue and slowly arrived at a viable solution requiring extreme software engineering care.
For our engineering analysis software, we navigated many performance compromises. One notable compromise (compared to game engines) has been on maintaining both high (FP64) and low precision (FP32) data sets for computation vs render — every iteration we must convert and buffer relevant results from device to host in order to maintain a global state with which clients can interact.
(Still, we are finding that proper software design yields a compute bottleneck in GPU-like devices, rather than I/O bandwidth limitation over PCIe — so this extra process is not responsible for any slowdown. We’re measuring and reporting more than 25x speed-up over CPU-only).
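The per-iteration convert-and-buffer step described above can be pictured with a small host-side sketch. This is not XCOMPUTE’s implementation (the real conversion may well happen on the device before transfer); `refresh_render_buffer` is a hypothetical name, and the example only shows the idea of keeping an FP32 render copy in step with FP64 compute results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Narrow a double-precision result array into a reusable single-precision
// buffer, as a render copy might be refreshed once per iteration.
void refresh_render_buffer(const std::vector<double>& compute,
                           std::vector<float>& render) {
    // Resizing is a no-op after the first call if the size is unchanged,
    // so the buffer is allocated once and reused.
    render.resize(compute.size());
    for (std::size_t i = 0; i < compute.size(); ++i)
        render[i] = static_cast<float>(compute[i]);
}
```

The render copy needs half the memory and bandwidth of the compute data, at the cost of one extra pass over the results.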
XCOMPUTE has gone through several thousand iterations to get where we are, and along the way we developed high-level and low-level optimizations and generalizations to further expand our capabilities and performance. For instance, we are approaching the minimum number of operations to synchronize arbitrary numerical data — and our C++ code syntax makes all these operations very clear and human-readable.
It should be little surprise that eventually there would be a high degree of data structure unification (via dynamic compile-time and run-time tricks), and that the messages required to save/load could possibly be reused in wide-scale communication protocols. After all, both messages require serialization and de-serialization infrastructure, so if the encoding/decoding format is flexible and nearly run-time optimal, why not unify all I/O? Especially if it is easily parallelized and permits flexible usage and sharing with users.
That is exactly what we did; we implemented “protocol buffers” using a schema file definition to build an array of sources, headers, and libraries that are later linked by the larger application during compile. There are no run-time libraries…it’s essentially a code generator.
The protobuf definition file assigns variable names to types and a specific integer spot; repeated and embedded messages are also possible. Developers have a clear way to package messages and the proto definition file can be made publicly available to bind external applications (to almost any language) to natively interface without compromising the legal intellectual property of the actual code-base. It’s just an interface.
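To make the "specific integer spot" concrete: in the protocol buffer wire format, each field is written as a key, `(field_number << 3) | wire_type`, followed by its payload, with integers stored as base-128 varints. The sketch below hand-encodes a single integer field to show that mechanism. The function names are hypothetical and the example schema line is made up; generated bindings do all of this for you.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `value` as a base-128 varint: 7 payload bits per byte,
// high bit set on every byte except the last.
void put_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>(value | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Wire types defined by the protocol buffer encoding.
enum WireType : std::uint32_t {
    kVarint = 0,
    kFixed64 = 1,
    kLengthDelimited = 2,
    kFixed32 = 5
};

// A field's key on the wire is (field_number << 3) | wire_type, as a varint.
void put_tag(std::vector<std::uint8_t>& out, std::uint32_t field_number,
             WireType type) {
    put_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | type);
}

// Encode a non-negative integer field, e.g. a made-up `int32 count = 1;`.
std::vector<std::uint8_t> encode_int_field(std::uint32_t field_number,
                                           std::uint64_t value) {
    std::vector<std::uint8_t> out;
    put_tag(out, field_number, kVarint);
    put_varint(out, value);
    return out;
}
```

Because only the field number and wire type travel with the data, variable names can change freely while old and new readers stay compatible, as long as the numbers are never reassigned.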
I’m only aware of two good protocol buffer libraries, both authored by the same person (first at Google, then on his own). The only major limitation I’ve encountered is that for both libraries (for various reasons), the maximum message size is limited to about 2^30 bytes, or about 1GB. This presents a challenge to the size of any one system, but should work well for most as large problems should be decomposed into manageable systems, not one huge homogeneous domain with poor numerical complexity.
I could talk for days about message design and how it sort-of parallels your class structures — and how it also is sort of its own thing! Being introspective on “what constitutes a message” can yield huge optimizations across your application in practice. This is because if messages are not well-encapsulated, they will tend to have repetitive or unnecessary data per the context. Ideally, you’d only transmit what is needed, especially given bandwidth constraints. If you can constrain this to a finite set of messages, you’re off to a great start.
Another really neat byproduct of server-client message unification is that servers already expect self-contained protobuf messages in order to perform operations, such as creating new objects (geometries, algorithms, etc). A command line interface (CLI) could also construct protobuf messages and invoke macro-level commands, just like a client. One could access a simulation via client, CLI, or through files on disk.
Applied to numerical computing, we developed four protocol buffer definition files, each applicable to specific contexts:
vector – array-like data that can benefit from arena allocations
geometry – topological information specific to a domain
setup – numerical system configuration data and associativity
meta – user preferences for a specific system
XCOMPUTE has implemented these messages for finite element and finite volume methods, and we are formalizing support for finite difference, lattice-Boltzmann, and advanced geometric representations. The following unified XCOMPUTE file types somewhat correspond to the aforementioned messages:
*.xco – numerical data array for a specific property-key (parallel CPU)
*.xcg – topology data for a specific geometry (structured / unstructured)
*.xcs – system setup (main project file, recursive directories)
*.xcm – metaobject media profile (technically a new 3d media format)
RSA or other encryption can wrap the serialized byte-stream as necessary. When you purchase an XCOMPUTE license, you receive a copy of these definitions along with a Creative Commons Attribution Non-Derivatives license to allow anyone to use them for their own projects and hopefully integrate with ours!
About six years ago, I was fortunate to receive hundreds of hours of guidance from the CFD chairman at Boeing (now at Blue Origin). As my startup’s acting VP of Research, he helped us establish technical requirements for a new simulation platform for next-gen systems. He set us on a path, and I worked to bring it all together, pulling from a spectrum of experiences at JPL, Blue Origin, and Virgin Galactic…
Why does hypersonic flight require a new engineering approach?
By definition, “hypersonic” means much faster than sound. There does not appear to be a formal demarcation between supersonic and hypersonic, but design philosophies start to deviate markedly as kinetics take over. At sufficient speed and conditions, traditional compressible flow theory becomes inaccurate due to additional energy modes of excitation, storage, and transmission (that were not included in the original model). As specific kinetic energy approaches molecular bond energies, a distribution undergoes dissociation, inhibiting chemical reformation reflected in further-limiting reaction progress (“Damkohler numbers”). A transition occurs as radiation dominates thermal modes. Plasma density increases as free stream energy density approaches valence electronic Gibbs potentials. At some point, you can’t extract net positive work because combustion doesn’t progress (until recombination outside the engine).
For air, I’d say hypersonic phenomena onset around M~6. Very few vehicles to date (or planned) have such capabilities, obviously.
However, I think it is within our technological grasp to cruise at Mach 15+ with the right configuration and engineering approach, enabling point-to-point travel and booster services for deploy-ables and satellites.
In time, I intend to demonstrate a clear pathway forward. First we must understand the basic principles and underlying processes…
Perhaps close or slightly worse than current commercial high-bypass turbo-jet engines, but certainly worse than future hydrogen turbo-jets!
However, it offers a marked performance improvement over traditional hydrogen-oxygen rockets: not only does the air-breathing vehicle not have to carry its own oxidizer, but it can control the effective specific impulse by varying the ratio of bypass (air) to heat input (fuel).
To move beyond traditional liquid-fueled rockets for high-speed trans-atmospheric flight, we can extract more thrust-per-Watt out of an air-breathing engine by including more air (“working fluid”) in the propulsive process at a slower jet speed (difference between engine outlet and inlet velocities). We essentially spread the jet power to maximize fuel efficiency (“effective specific impulse”) and to have jet outlet velocity match free-stream speed to maximize jet kinetic efficiency. (Ideally, once ejected, exhaust would stand still in the reference frame of the fluid. However this is not possible at very low speeds due to minimal mass flux through engine generating minimal net thrust, albeit at very high efficiency! At high Mach numbers, there isn’t enough delta-v in the exhaust to keep up with vehicle speed, and a gradual drop in thermodynamic efficiency is expected.)
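The thrust-per-Watt argument can be checked with the textbook ideal-jet relations. The sketch assumes an idealized air-breathing jet with fuel mass flow and pressure thrust neglected: thrust is mass flow times the velocity increment, and jet power is the kinetic power added to the stream. `JetPoint` and `ideal_jet` are hypothetical names, and the numbers in the usage note are illustrative only.

```cpp
#include <cassert>
#include <cmath>

struct JetPoint {
    double thrust;           // N
    double jet_power;        // W, kinetic power added to the stream
    double thrust_per_watt;  // N/W
    double propulsive_eff;   // thrust power divided by jet power
};

// Ideal air-breathing jet: mass flow `mdot` [kg/s] enters at flight speed
// `v0` [m/s] and leaves at `ve` [m/s] relative to the vehicle.
JetPoint ideal_jet(double mdot, double v0, double ve) {
    assert(mdot > 0.0 && v0 >= 0.0 && ve > v0);
    JetPoint p;
    p.thrust = mdot * (ve - v0);
    p.jet_power = 0.5 * mdot * (ve * ve - v0 * v0);
    p.thrust_per_watt = p.thrust / p.jet_power;      // equals 2 / (ve + v0)
    p.propulsive_eff = p.thrust * v0 / p.jet_power;  // equals 2 v0 / (ve + v0)
    return p;
}
```

For example, at 250 m/s flight speed, `ideal_jet(100.0, 250.0, 350.0)` and `ideal_jet(10.0, 250.0, 1250.0)` both produce 10 kN, but the first needs 3 MW of jet power and the second 7.5 MW: more working fluid at a slower jet speed gives more thrust per Watt.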
The average kinetic energy of the vehicle scales as the square of its speed, while the power required to sustain flight scales as the cube.
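A short sketch of where those exponents come from, assuming steady level flight where thrust balances drag D, with air density ρ, drag coefficient C_D, and reference area A held constant. These symbols are introduced here for illustration; in practice C_D and ρ vary with Mach number and altitude.

```latex
E_k = \tfrac{1}{2} m v^2 \;\propto\; v^2,
\qquad
P = D\,v = \left(\tfrac{1}{2}\rho v^2 C_D A\right) v
  = \tfrac{1}{2}\rho\, C_D A\, v^3 \;\propto\; v^3
```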
What does this mean about powered vehicles that fly very fast?
As vehicle power-density scales as speed cubed, propulsion starts to dominate the design of the vehicle in hypersonics. The vehicle becomes a big flying engine as M->25, and the project schedule and funding should reflect this. Based on flight profile and lift requirements, a linear “wave-rider” design may be considered vs more practical annular layout (which also is more efficient at carrying large thermal-stress loads and propellant storage). Fuel density remains important, but not as much as net specific energy density.
Sub-cooled liquid hydrogen is used as fuel and coolant, and if pressed above supercriticality, has insane heat capacity — but at a cost of varying density (and Nusselt number used in regen cooling analysis). Both active and passive cooling strategies are required to offset vehicle and engine heat transfer. An open cycle is unacceptable to overall performance, so boundary layer coolant (BLC) must be injected on leading surfaces and ingested / combusted (as part of a turbulent shock-detonation inlet). Combustion takes place in specialized sub-sonic burners before being mixed with the primary flow as part of a closed staged-combustion cycle. Liquid oxygen is supplemented to the combustors for take-off and LEO injection.
Engine length becomes an impediment in smaller vehicles (such as those encountered by any research/test program) due to finite combustion reaction time, requiring longer characteristic chamber length to ensure relatively-complete combustion (Damkohler numbers close to one). Net chemical power extraction is balanced against thermal and drag impediments, so the systems must balance all these and resolve rate reacting large eddy simulation (LES), as physical testing will have inherent limitations to replicate and measure combustion environment. Simulations are used for analysis and optimization and to characterize transfer functions to be applied as the machine’s advanced onboard control system.
Although a hypersonic compressor and diffuser do not use rotating turbomachinery (per excessive thermal-stresses), supporting cooling and fluid control systems remain a large-scale systems engineering challenge. The technical scope is akin to a nuclear power plant that can fly and requires multiple modes of operation. Structural engineering must make no assumptions regarding thermal and acoustic environments as the vehicle will pass through many regimes, expected and off-nominal. Quantifying dynamic load environments requires experiment or flight experience, as computing resources to resolve turbulent micro-structures scale as the Reynolds number to the 9/4 power, more than the square of speed!
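The 9/4 scaling is the classical estimate for the number of grid points needed to resolve three-dimensional turbulence down to the Kolmogorov scale. The sketch below takes the prefactor as 1 and assumes Reynolds number grows linearly with speed (length scale and viscosity held fixed), so it gives orders of magnitude only; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Order-of-magnitude count of grid points needed to resolve turbulence
// down to the Kolmogorov scale in 3D: N ~ Re^(9/4), prefactor taken as 1.
double dns_grid_points(double reynolds) {
    assert(reynolds > 0.0);
    return std::pow(reynolds, 9.0 / 4.0);
}

// Factor by which that count grows when speed (and hence Re, with length
// scale and viscosity held fixed) is multiplied by `speed_ratio`.
double dns_cost_growth(double speed_ratio) {
    assert(speed_ratio > 0.0);
    return std::pow(speed_ratio, 9.0 / 4.0);
}
```

Doubling the speed multiplies the point count by about 4.76 under these assumptions, and a Reynolds number of 10^4 already implies on the order of 10^9 points.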
To have any hope of getting this right, we must have a very strong concept and technology basis. We need a good initial vector and structured yet flexible approach…so defining the problem by systems and subsystems provides the exact encapsulation and recursive definition required to be infinitely interchangeable and expandable (only limited by computing resources). These tools must be intuitive and powerful as to fully-leverage parallel computing so analysis doesn’t continue to be the bottleneck:
START WITH THE DATA LAYERS
There’s obviously a lot of competing factors in advanced aerospace and energy systems. To integrate these different domains (fluid, thermal, mechanical, electronic) we need an alternative to the current isolated unidirectional “waterfall” engineering process. We need a unified HPC platform everyone can use to integrate systems, not just fluids or solids.
To take steps beyond theory into practice, to actually conceptualize, design, analyze, and build these systems, we need some amazing software and sustained discipline across many teams. Realistically, the problem must be approached with a strong systems framework and restraint on exotics. (“Can I personally actually build this?”) I’ve been participating in various AIAA and peer-review conferences over the past years, and there is certainly some impressive work out there. I think the CREATE suite from the DoD has taken a real but ambitious approach to give the military turn-key analysis tools. However, I haven’t seen many commercial or academic firms with their eye (or checkbook) on the systems challenge of next-gen engineering, let alone an architecture that demonstrates multi-disciplinary functionality now (CFD, FEA, etc) while remaining relevant to future computing.
I pulled away from the aerospace industry to dedicate just under 20,000 hours to this software infrastructure, collaborating with a few bright graduate researchers at Stanford, MIT, and the Von Karman Institute. We made hundreds of thousands of code contributions across more than two-thousand commits. We burned through a small fortune of friends and family investments and leveraged technology to work more efficiently towards decadal objectives of NASA. Things, we have reason to believe, few are attempting. It is now getting exciting…
Despite funding obstacles, we’ve broken through major barriers and are ready to apply our new advanced engineering platform to new projects — leveraging modern software machinery (C++14, OpenCL) and processing hardware (CPU, GPU, FPGA). Our integrated engineering environment provides end-to-end capabilities for such grand challenges. We can now build simulations out of different systems and algorithms and dispatch them to any processor. Aerospace is only the first use case.
You’ve really got to be a fearless generalist to take on something like this. But you’ve also got to be able to dive deep into key areas and understand the process on first and zeroth principles. Many fields of mathematics and technical practice, concentrated into an applied real-world problem. Since you can’t rely on books for answers to new questions, we must inquire into the fundamental laws and be cognizant of our human constructs and assumptions made therein.
Is it possible to optimize against physics while also providing a practical engineering path?
I’ve pondered such quandaries for many years, but now I think I have a clear path. Over the next few years I hope to demonstrate and share what I can on this blog.
P.S. All this talk about jet engine thrust reminds me of this time a senior engineer at Blue Origin emailed a challenge question to the company along the lines of – if force is the integral of pressure times area, what parts of a jet engine are most responsible for its net thrust generation?
Do you know?
It appears most of the company did not. I took a stab:
It’s the pressure differential across the bypass compressor blades, probably followed by the central jet exit (and its compressor blades and internal cowling).
I hope to share some fun stuff in the realm of numerical simulation, machine design, and aerospace systems. Realistically, lots of other crazy stuff will pop up along the way….
A bit about me: educated at Harvey Mudd College in general engineering. Had a few internships at NASA JPL/Caltech supporting Dawn, MSL, and hypervelocity impactor programs. Then a short stint at Blue Origin developing engine infrastructure. Found myself reinventing CFD numerics on my laptop in Matlab to address engine and facility challenges at Virgin Galactic. Then founded Xplicit Computing and worked very hard to bring all the best ideas and people together. Broke code and new ground in HPC…
Over the last five years I’ve been very focused on building the software data layers to enable next-gen engines and power systems. XCOMPUTE enables us to define and simulate complex systems building blocks for heterogeneous (CPU/GPU) algorithm processing. This means we can solve fluid, solid, or any other problem in a unified architecture, leveraging the latest in C++ and OpenCL. Powerful advanced simulation is now available on desktop computers! Computing tools are now accessible to many more people…a huge impact on small and big businesses.
I’m also into a variety of music (piano improv, percussion, etc), cultural foods (many types), and philosophy (Spinoza-Einstein).
Very exciting new things on the horizon. Stay tuned…