ICCFD10, Barcelona

PSA: Abstracts for ICCFD11 are due on 18 Feb 2020!

Amazing destination conference opportunity in Maui, Hawaii

Every two years, the International Conference on Computational Fluid Dynamics is held in a different iconic city across the globe. According to the ICCFD homepage, ICCFD conferences began in 2000 as the merger of “Numerical Methods in Fluid Dynamics (ICNMFD), and the International Symposium on Computational Fluid Dynamics (ISCFD); which had been running since 1969 and 1985, respectively.”

In early 2018, I submitted two abstracts and was lucky enough to be invited to present a poster on “Scalable HPC Building-Block for Multi-Disciplinary Systems” and a complete paper on “Unified Geometries for Dynamic HPC Modeling”. I’ve published a few papers before, but this felt like a new experience; here I was submitting for peer review intimate technical details on the inner workings of some of my original innovations. In the past, it was always as part of an academic group. I worked for two months to bring together my 20-page paper, leaving the presentation until the week of the event…

Pretty familiar experience for most engineers — those minutes leading up to a big presentation…
This balanced tree approach to geometry type taxonomy permits nice inheritance and functional polymorphic optimizations. It is also really interesting to resolve the characteristic similarities and differences between types. There are embedded levels of orthogonality in many spaces. This yields a complete solution space to address all analysis challenges.
In the red, it says that across types we found spans on the order of 50X in memory usage and 60X in IO usage, and about a 20X speedup when running on GPU vs CPU. Not bad!
Now that the work part is done, I can look around…
Barcelona is an absolutely gorgeous city, and maintains its own special timeless charm. This is pretty much a typical back alley.
Many interiors have simply stunning craftsmanship and taste. Visual inspiration all around!

The lovely coastal city of Sitges, Spain!
The world-famous multi-century construction project, Sagrada Familia.

Food. Okay, let me just say that perhaps my favorite thing about Spain was the food. Fresh, flavorful, and affordable. Tapas are great, but we made a point to try many different types of restaurants, ranging from known tourist spots to tiny spots that only open for a few hours in the late evening to serve a couple of families. Honestly, almost every place we tried had something special about it, and several of them we just had to come back to for a second or third time. Really, we should have taken more food pictures.

To make up for it, here are a few treats:

Hope to see you at ICCFD11 Summer 2020 in Maui, Hawaii!

high-performance schema for science & engineering

It’s advanced and universal?

Easy to use and FREE??


Wouldn’t it be great to be able to access data easily with any of your favorite languages?

Build advanced apps and workflows?

XCOMPUTE utilizes a new strategy (originally developed by Google) to express complex data between computers / sessions as protocol buffers.

When you save or load to disk or transmit something over a network, the associative data structures present in your computer’s RAM must be flattened (aka serialized), buffered, and eventually reconstructed (aka deserialized) so that they can be transmitted in linear fashion across a wire or into a storage device and back again.

There are many ways to do this, but most are not suitable for big data.
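As a toy illustration of the flatten-and-reconstruct round trip described above, here is a minimal C++ sketch. The byte layout and the function names `serialize` and `deserialize` are hypothetical and chosen only for this example; this is not the protocol buffer encoding, and it assumes the writer and reader share the same endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Flatten an associative structure into one linear byte buffer.
// Hypothetical layout per entry: [uint32 key length][key bytes][double value]
std::vector<std::uint8_t> serialize(const std::map<std::string, double>& data) {
    std::vector<std::uint8_t> buffer;
    for (const auto& entry : data) {
        assert(entry.first.size() <= UINT32_MAX);  // key length must fit in 32 bits
        const std::uint32_t len = static_cast<std::uint32_t>(entry.first.size());
        const auto* p = reinterpret_cast<const std::uint8_t*>(&len);
        buffer.insert(buffer.end(), p, p + sizeof(len));
        buffer.insert(buffer.end(), entry.first.begin(), entry.first.end());
        p = reinterpret_cast<const std::uint8_t*>(&entry.second);
        buffer.insert(buffer.end(), p, p + sizeof(entry.second));
    }
    return buffer;
}

// Rebuild the associative structure from the linear buffer.
std::map<std::string, double> deserialize(const std::vector<std::uint8_t>& buffer) {
    std::map<std::string, double> data;
    std::size_t pos = 0;
    while (pos + sizeof(std::uint32_t) <= buffer.size()) {
        std::uint32_t len = 0;
        std::memcpy(&len, buffer.data() + pos, sizeof(len));
        pos += sizeof(len);
        if (pos + len + sizeof(double) > buffer.size()) break;  // truncated input
        std::string key(reinterpret_cast<const char*>(buffer.data() + pos), len);
        pos += len;
        double value = 0.0;
        std::memcpy(&value, buffer.data() + pos, sizeof(value));
        pos += sizeof(value);
        data.emplace(std::move(key), value);
    }
    return data;
}
```

A hand-rolled format like this has no schema, no versioning, and no cross-language support, which is part of why a generated interface is attractive.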

We’ve elected to use a special protoc compiler to auto-generate compatible interfaces that provide native access across many languages. They’re essentially feather-weight code headers or libraries that allow you to tie into xcompute.

They also sport speeds approaching the theoretical limits of the attached devices and channels (PCIe, etc).

Messages™ by Xplicit Computing provides standard support for:

While xcompute-server remains a proprietary centerpiece of the XC ecosystem, we’re excited to announce our plan to release our other official Apps, free & open!

This way, everyday users do not have to worry about subscription to xcompute-client. It makes collaboration that much easier.

Hosts maintain their xcompute-server subscriptions and now can invite friends and colleagues freely, and share results as they please with said Apps.

You own and control your data, while Xplicit continues to focus on providing high-quality, unified technologies.

For a technical overview, please read the excerpt below from the README provided with the Messages™ bundle:

for complex systems, FEA, CFD, EDA, and computational geometry


These proto files provide direct access to xcompute messages (file and wire) by generating accessor functions for your favorite languages. This empowers xcompute users and non-users to directly manipulate and access the setup and data to/from xcompute, in a high-performance universal way, suitable for integration with other applications and scripts. These four files are central to the xc ecosystem (e.g. both open and proprietary apps), organized as follows:

  • system.proto – domain-specific parameters, associations, references
  • vector.proto – arrays and linear attributes that benefit from packed arena allocation
  • geometry.proto – topological description of elements and regions for a specific domain
  • meta.proto – user profile and meta-data for a specific system

  • This protocol buffer format can deliver high-performance, elegant, and flexible numerical messaging across many domains of science and engineering (e.g. single- and double-precision floating point data, etc.).

  • universal formats for storage and transmission between demanding applications
  • cross-platform accessors and serialization utilities
  • flexible associative and vectorized data patterns
  • object-oriented modularity with reverse and forward compatibility
  • supports multi-threaded I/O within vectors and across files

  • Limitations:
  • maximum individual file size is 2GB, limiting individual systems to ~256M nodes (limited by Google’s 32-bit memory layout)

  • Large systems should be decomposed into several smaller systems if possible, for many reasons. It’s more efficient and accurate to specialize the physics, mediating across regions where required. Try not to solve extra DOFs unnecessarily by making one huge domain that solves everything. Memory requirements vary across methods, but are generally limited by your compute device memory…not the storage format or SSD. It is up to each workgroup to determine what is an appropriate resolution for each study. A top-down systems approach is the best way to resolve from low to high fidelity and maintain accountability across the team…
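One way to arrive at a figure of that size for the file limit above is simple division. The sketch below assumes that "2GB" means 2^31 bytes and that each node stores a single double-precision value (8 bytes); both are assumptions for illustration, not statements from the README, and the names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: "2GB" means 2^31 bytes, the range of a signed 32-bit size.
constexpr std::int64_t kMaxFileBytes = std::int64_t{1} << 31;

// Number of nodes that fit in one file for a given per-node footprint.
std::int64_t max_nodes(std::int64_t bytes_per_node) {
    assert(bytes_per_node > 0);
    return kMaxFileBytes / bytes_per_node;
}
```

With 8 bytes per node, `max_nodes(8)` gives 268,435,456, which is 256 × 2^20. Storing more data per node (for example a 3-component vector) lowers the count proportionally.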


    Auto-generated bindings are provided for the following languages: C++, Obj-C, C#, Python, Java, Javascript, Ruby, and Go. In your language environment, import the relevant files as headers or libraries. Statically-compiled languages such as C++, Obj-C, and C# may require linking to the static library libxcmessages.a .

    In your environment, various classes should become available. In C++ they can be found under the namespace Messages:: . Refer to the *.proto definitions for how each attribute is defined, knowing that your access pattern is built from these assignments directly. You can access this associative database using getters and setters…

    In C++, the pattern for accessing primitives (bool, int32, int64, float, double, string) looks like:
    auto some_value = msg.something();

    Repeated fields (vectors, etc) can be accessed somewhat normally. Range-based iteration:
    for (auto entry : msg.vector() )
        something = entry;

    Or alternatively for parallel iteration:
    auto N = msg.vector_size();
    #pragma omp parallel for
    for (auto n=0; n<N; n++)
        something[n] = msg.vector(n);

    More complex data structures may require mutable getters:
    auto N = other.vector_size();
    //get a reference to mutable object
    auto& vec = *msg.mutable_vector();
    #pragma omp parallel for
    for (auto n=0; n<N; n++)
        vec[n] = other.vector(n);

    Please refer to the Proto3 Tutorials for typical programming patterns.


    If you’re an advanced application programmer, you may wish to build upon our bindings to customize them for your own projects. This is encouraged as long as existing definitions are not altered. Use a Google Protobuf 3 Compiler to generate your new bindings. A protoc 3 compiler may be readily available in a package manager or installed from online sources or binaries. Proto2 will not work; the compiler must be Proto3 or newer.

    After protoc is installed, make a directory for each language and run the compiler from shell or script:
    > mkdir -p cpp python java javascript ruby objc csharp go
    > protoc --cpp_out=cpp --python_out=python --java_out=java --js_out=javascript --ruby_out=ruby --objc_out=objc --csharp_out=csharp vector.proto system.proto spatial.proto meta.proto

    Spaceport Cup 2019

    Once a year a pilgrimage occurs to the mecca of amateur and commercial rocketry, nestled in the desert north of Las Cruces, New Mexico. University-sponsored teams from around the world converge on Spaceport America to demonstrate their rocket design, analysis, building, and launching skills… one of the greatest defining moments in their collegiate careers. Teams can compete in certain competition categories, targeting altitudes of 10,000ft and 30,000ft. However, some teams elect to attempt flights beyond that, to 50,000ft and up to 100,000ft, using solid, hybrid, and liquid propulsion systems.

    This facility is uniquely positioned just west of White Sands Missile Range, so it is possible to obtain waivers to fly all the way to space on a regular basis, if required… as are the plans of emerging spaceflight companies such as Virgin Galactic.

    It turns out the event was so popular this year that there were zero cars available for rent in the El Paso region. I was able to bum rides for the duration of the trip and made some profound new friendships. The morning of the first flights, we made it out to Spaceport America before sunrise, since morning weather is best for launches. The largest building in the complex is currently VG’s new hangar, dubbed “The Gateway to Space”. This quick shot from our car doesn’t do it justice; it’s an incredibly inspiring facility that represents the emerging commercial spaceflight industry.
    Here’s a view of the Gateway from the Spaceport Operations Center. The state of New Mexico is investing heavily into emerging space industry. Personally, I think these lands will be of significant importance moving forward.
    Here’s the Gateway to Space from the back (2.5 mile runway behind the camera). It was such a treat to meet Spaceport CEO Dan Hicks and his staff. He’s one of the nicest and most competent leaders I’ve encountered in the industry. He’s assembled an incredible team to move things forward, methodically and safely…and inspiring innumerable people along the way.
    The Spaceport supports propulsion testing and vehicle flight test operations, spanning solid, hybrid, and liquid engine efforts. Designated areas and modern infrastructure will position them for long-term success. However, they must also be prepared and trained in emergency response. Spaceport America has its own specially-equipped fire and police department on duty 24/7. This new fire truck is part of their arsenal, with all modern sensing and life-saving technologies. They work closely with local and federal authorities.
    It’s no easy task to host several thousand students out in the desert for most of a week. There was a designated area for general spectators as well as a restricted area for rocketeers, judges, and volunteers. I’m very grateful to Spaceport America for granting me access to all areas.
    A dry heat was sustained at over 100 degrees most of the day and dust storms tend to pick up as the day progresses. Numerous facilities and comforts were brought in for water, food, shade, and first-aid. Almost everyone was struggling against the heat, but the event did a very good job of providing rest areas and reminding attendees to keep hydrated. The food trucks and icies were especially popular. The people were the best part.
    Over 120 university teams and numerous aerospace companies participated. It was a delight to walk the rows of tents and talk to each team. The international participation was strong (with more than 17 countries represented) and there was a feeling it was special for everyone to interact, share, and support. While this was a competition, the demeanor was one of cooperation and friendship.
    Many well-known universities had a presence. Most had to drive for several days to get their equipment and rocket to the launch site. For many, this was the capstone project for their university major.
    17 teams from Canada participated in the competition. It’s superb to see such camaraderie between students across borders. Perhaps the UN should be replaced by rocket clubs… 🙂
    University of Washington came in full-force with their awesome liquid rocket project. They would later be crowned the winner of the 2019 Spaceport Cup!
    Switzerland also had several teams with some great innovations and craftsmanship. When I launched rockets about a decade ago, active control systems were extremely rare. Nowadays, teams like these can reliably deploy airbrakes and other control surfaces to help hit their target altitude.
    I was delighted to also meet a few teams from India. Not only did they muster the time and resources to put together some amazing projects, I think it was super special they made the long journey (which is not easy with rocket components).
    This picture does not do it justice. Iowa State University’s rocket appeared to have some of the best aerodynamic designs I’ve seen in rocketry. In particular, their forward control canards were very close to optimized shapes with professional fabrication and positioned for (what appeared to me) proper CG-CP correlation and potential control-ability?! For the expected supersonic flight regime, the vehicle proportions seemed close to optimal.

    It was such a treat to be able to meet with so many of you! Honestly, I think you’re all winners and given that the worst injuries that I heard of were heat exhaustion and a sprained ankle, I would say congratulations for being a part of a world-class rocket event! Hope to see you next year!

    Ad Astra!

    ‘the space show’ called me E.T.

    That’s the pot calling the kettle black!

    Thank you for the kind words and interest! We had a BLAST at Space Access 2019!


    Xplicit Computing gets discussed from 46:21 – 52:20 . Here’s the segment set to some of the presentation material:

    Space Access 2019

    The most interesting space tech conference you’ve never heard about.

    Turns out crazy comes in many different forms:

    Lady and Gentlemen gather on the last eve of Space Access 2019. We were actually kicked out of the hotel bar because we were too excitable. On several occasions, I was approached with the pickup line “want to see my rocket/engine?” (See below)
    I was pretty busy piloting my own XCOMPUTE spaceship. It is powered by high performance computers and good vibes. We were there celebrating our official product launch! Check out emerging capabilities and special offers with our partners R Systems and Rescale.
    Hosted by ERPS and SAS, these events run long and hard, all day. It’s exhausting to soak it all in…each talk is so different and interesting…it feels terrible when you must pick and choose. On Thursday night I had a brief 15-minute slot to introduce the new platform.

    Here’s most of the video: (though there wasn’t any music in real life, just me talking)
    This short musical montage looks back at our development over the last 5-6 years (from a graphical UI perspective) as we have iterated toward the current enterprise platform. The early 2D FDM prototype codes were truly impressive and beautiful. We’ve taken a detour into more complex 3D FVM, FEM, and other essential methods first before expanding once again. We think this new architecture will give us 100x the power and flexibility of our early numerical codes. Further, we’re looking into the future of advanced machine design and operating systems.
    An incredible spectrum of people were in attendance, spanning advanced amateurs, university researchers, aerospace start-ups, and even notable legends such as directors for major US agencies such as DARPA.

    The event was structured in a way to maximize people connections to facilitate business in space sectors. The first day was focused on practical space entrepreneurship and business activities. The second day was more ambitious trans/cis-lunar and deep-space exploration. The last day was high-risk high-reward concepts with a keen eye on energy/power systems. Probably more than 50% of attendees held an engineering degree and/or industry experience.
    After another long day of talks, we were excited to get an exclusive update from the SpaceIL founder on their recent attempts at landing their Beresheet spacecraft on the lunar surface. Huge inspiration to all, despite the terrible connection and A/V issues.
    This is how we have fun and put our bench-top rocket fuel pumps to use when not on exhibit or moving hypergols. Explosions were controlled, mostly. Two raging margarita parties fueled some of the leading rocket scientists to get belligerent and bash scramjets. Because we’re all so agreeable…
    Going into the event I didn’t really have anything good to show. Long story short, the night/morning before I set up my computer in my hotel room and ran a 6.7M element CFD with the A/C directly into that Titan-Z going at full blast. I saved 1/10 of the 10,000 iterations to yield 350 GB of data in about 4 hours. (Each frame is about 350 MB)
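The numbers above are easy to sanity-check. This small sketch uses a hypothetical helper, `estimate_output`, and assumes decimal units (1 GB = 1000 MB) and a uniform frame size; it also derives the average sustained write rate, which the post does not state.

```cpp
#include <cassert>

struct RunOutput {
    long frames;           // number of saved frames
    double total_gb;       // total data written, in GB (1 GB = 1000 MB)
    double mb_per_second;  // average write rate over the whole run
};

// Estimate saved output for a run that writes every `save_every` iterations.
RunOutput estimate_output(long iterations, long save_every,
                          double frame_mb, double hours) {
    assert(save_every > 0 && hours > 0.0);
    const long frames = iterations / save_every;
    const double total_mb = frames * frame_mb;
    return {frames, total_mb / 1000.0, total_mb / (hours * 3600.0)};
}
```

Calling `estimate_output(10000, 10, 350.0, 4.0)` gives 1,000 frames and 350 GB, for an average of roughly 24 MB/s written to disk alongside the computation.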

    unified graphics controller

    XCOMPUTE’s graphics architecture is built on OpenGL 3.3 with some basic GLSL shaders. The focus has always been on efficiency and usefulness with large engineering data sets – it is meant to visualize systems.

    However, along the way we recognized that we could unify all graphics objects (technically, vertex array objects) in our render pipeline as to not only handle 3d objects, topologies, and point clouds, but provide a powerful framework for in-scene widgets and helpers. We’ve barely started on that:

    A basic in-scene color bar utilizes multiple graphics components: text is rendered with glyph texture atlases, and the color legend uses a similar texture technique but with PNG infrastructure. The pane itself is also a graphics object. Each has a unique definition and functions but unified stylistic and graphics controls. Note that the simulation behind the color bar and its meta-regions are also graphics objects with similar capabilities and controls.

    As we’re getting ready to launch the product, I’m connecting modules that perhaps didn’t have priority in the past. The other night, I spent a few hours looking at what easy things we could do with a unified “appearance” widget, built in the client with Qt in about 130 lines:

    XCOMPUTE’s Appearance Widget allows users to customize color, transparency, textures, and render modes. Style control flags are updated instantly, not requiring any data synchronization!

    I then imported a complex bracket geometry and applied a wood PNG texture with RGBA channels projected in the Z-direction:

    Triply-periodic bracket structure (held on left, loaded on right) imported as STL and meshed in XCOMPUTE with 440K cells in about 10 minutes. Shown is a highly-computable finite element mesh based on Per-Olof Persson’s thesis.

    This looks pretty good for rasterization (60fps @ 1440×2560), but it isn’t perfect….there are a few artifacts and shadowing is simplified. I think the space between the wood slats is really cool and makes me want to grab this thing and pull it apart. Those gaps are simply from the alpha-channel of the PNG image…just for fun. We’ll expose more bells and whistles eventually.

    Soon, I’ll show the next step of analyzing such a component including semi-realistic displacement animations.

    In the future (as we mature our signed distance infrastructure), we may look at ray-tracing techniques, but for now the focus is on efficiency for practical engineering analyses.

    server-client protocol buffer specs

    It’s no secret around here that I’ve been burning the candle from both ends in order to complete “The Great Server-Client Divide” as we call this year-long task. A task that has been in planning since the very start.

    With big-data applications, it’s challenging to get a server (simulation state machine) to interact (somewhat generically) with any number of clients without compromising on performance. We studied the principles and the mechanics of this issue and slowly arrived at a viable solution requiring extreme software engineering care.

    For our engineering analysis software, we navigated many performance compromises. One notable compromise (compared to game engines) has been on maintaining both high (FP64) and low precision (FP32) data sets for computation vs render — every iteration we must convert and buffer relevant results from device to host in order to maintain a global state with which clients can interact.

    (Still, we are finding that proper software design yields a compute bottleneck in GPU-like devices, rather than I/O bandwidth limitation over PCIe — so this extra process is not responsible for any slowdown. We’re measuring and reporting more than 25x speed-up over CPU-only).
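The per-iteration precision conversion described above can be sketched roughly as follows. The function name `to_render_buffer` is hypothetical and this is only a host-side illustration; it says nothing about how XCOMPUTE actually moves data between device and host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Narrow FP64 compute results into a reusable FP32 buffer for rendering.
// Reusing `render` across iterations avoids reallocating every step.
void to_render_buffer(const std::vector<double>& results,
                      std::vector<float>& render) {
    render.resize(results.size());
    assert(render.size() == results.size());
    for (std::size_t i = 0; i < results.size(); ++i) {
        render[i] = static_cast<float>(results[i]);  // rounds to nearest float
    }
}
```

The FP32 copy is half the size of the FP64 source, which also halves what has to be packaged for clients that only need to draw the result.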

    A server-client architecture utilizes a central server host with any number of OpenCL-compatible devices and a filesystem. One or more clients can connect to the server through a network connection, communicating needs and accepting pre-packaged data from the server. The client renders the data using a local GPU device.

    XCOMPUTE has gone through several thousand iterations to get where we are, and along the way we developed high-level and low-level optimizations and generalizations to further expand our capabilities and performance. For instance, we are approaching the minimum number of operations to synchronize arbitrary numerical data — and our C++ code syntax makes all these operations very clear and human-readable.

    It should be little surprise that eventually there would be a high degree of data structure unification (via dynamic compile-time and run-time tricks), and that the messages required to save/load could possibly be reused in wide-scale communication protocols. After all, both messages require serialization and de-serialization infrastructure, so if the encoding/decoding format is flexible and nearly run-time optimal, why not unify all I/O? Especially if it is easily parallelized and permits flexible usage and sharing with users.

    That is exactly what we did; we implemented “protocol buffers” using a schema file definition to build an array of sources, headers, and libraries that are later linked by the larger application during compile. There are no run-time libraries…it’s essentially a code generator.

    The protobuf definition file assigns variable names to types and a specific integer spot; repeated and embedded messages are also possible. Developers have a clear way to package messages and the proto definition file can be made publicly available to bind external applications (to almost any language) to natively interface without compromising the legal intellectual property of the actual code-base. It’s just an interface.
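To make the "specific integer spot" concrete: in the protocol buffer wire format, each field is written as a key followed by its payload, where the key is `(field_number << 3) | wire_type` encoded as a base-128 varint. The sketch below hand-encodes a single integer field to show this; the function names are hypothetical, and in practice the protoc-generated code does this for you.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `value` as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last.
void append_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Encode one unsigned integer field: key first, then the varint payload.
std::vector<std::uint8_t> encode_varint_field(std::uint32_t field_number,
                                              std::uint64_t value) {
    assert(field_number >= 1 && field_number <= 536870911);  // 2^29 - 1
    const std::uint64_t wire_type = 0;  // 0 = VARINT
    std::vector<std::uint8_t> out;
    append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | wire_type);
    append_varint(out, value);
    return out;
}
```

For field number 1 holding the value 150, this produces the three bytes `08 96 01`. Because only the field number appears on the wire, variable names in the definition file can change without breaking readers, while the numbers must stay fixed.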

    I’m only aware of two good protocol buffer libraries, both authored by the same person (first at Google, then on his own). The only major limitation I’ve encountered is that for both libraries (for various reasons), the maximum message size is limited to about 2^30 bytes, or about 1GB. This presents a challenge to the size of any one system, but should work well for most as large problems should be decomposed into manageable systems, not one huge homogeneous domain with poor numerical complexity.

    I could talk for days about message design and how it sort-of parallels your class structures — and how it also is sort of its own thing! Being introspective on “what constitutes a message” can yield huge optimizations across your application in practice. This is because if messages are not well-encapsulated, they will tend to have repetitive or unnecessary data per the context. Ideally, you’d only transmit what is needed, especially given bandwidth constraints. If you can constrain this to a finite set of messages, you’re off to a great start.

    Another really neat byproduct of server-client message unification is that servers already expect self-contained protobuf messages in order to perform operations, such as creating new objects (geometries, algorithms, etc). A command line interface (CLI) could also construct protobuf messages and invoke macro-level commands, just like a client. One could access a simulation via client, CLI, or through files on disk.

    Applied to numerical computing, we developed four protocol buffer definition files, each applicable to specific contexts:

    • vector – array-like data that can benefit from arena allocations
    • geometry – topological information specific to a domain
    • setup – numerical system configuration data and associativity
    • meta – user preferences for a specific system

    XCOMPUTE has implemented these messages for finite element and finite volume methods, and we are formalizing support for finite difference, lattice-Boltzmann, and advanced geometric representations. The following unified XCOMPUTE file types somewhat correspond to the aforementioned messages:

    • *.xco – numerical data array for a specific property-key (parallel CPU)
    • *.xcg – topology data for a specific geometry (structured / unstructured)
    • *.xcs – system setup (main project file, recursive directories)
    • *.xcm – metaobject media profile (technically a new 3d media format)

    RSA or other encryption can wrap the serialized byte-stream as necessary. When you purchase an XCOMPUTE license, you receive a copy of these definitions along with a Creative Commons Attribution Non-Derivatives license to allow anyone to use them for their own projects and hopefully integrate with ours!