Data-driven Fluid Simulation
Abstract
This project examines the feasibility of simulating fluids using a data-driven approach similar to pixel-based texture synthesis techniques [Efros99, Wei00].
The training data is used to compute a new simulation by matching local neighborhoods in the simulation with similar neighborhoods in the training data. This computation reduces to a nearest neighbor search in a high dimensional space.
Algorithm Details
The input data is a sequence of vector and scalar fields. This data is examined as a stream along the temporal dimension. The source of the data can either be another simulation or measurements from a real fluid. In practice we will be dealing with discrete scalar and vector fields on a uniform grid.
The training data is converted into a set of representative samples. Each sample is a vector of the parameters present in a small window of the fluid simulation. This is referred to as a neighborhood. Each vector is also associated with its key cell's state in the next timestep. They key cell is the cell in the center of the neighborhood. We will call the key cell's next state the predicted state. The key is the cell a match is attempting to compute and its predicted state is the key one timestep later. If an area of the simulated fluid nearly matches a sample vector, then it's key cell is replaced by the predicted state cell for the next timestep.
Here is an illustration of the step procedure using a 3-by-3 neighborhood:
Neighborhood diagram
For each neighborhood in the simulation data (lower-left), the training data is searched for a similar neighborhood. The most similar neighborhood is then used (upper-left) to determine what the simulation should do. This is done by simply copying the state of the key cell in the next timestep of the training data to the simulation. The assumption is that if the input neighborhoods were similar in the previous timestep, then they should be do similar things in this simulation.
The simulation state is a uniform grid of the fluid parameters. I can be initialized by simply filling values form the training data. This will likely result in an inconsistent initial state, causing the simulation to have very poor matches for the first few iterations. In practice it appears to converge on better matches quickly. Other possibility for the initial conditions uses domain specific knowledge to fill in values that the operator knows are physically realistic.
At this point the algorithm is only able to simulate passive flows. In general we might wish to add an external force (f). The following formula computes the change in velocity as a function of density (d) and area/volume of a cell (A).
External force formula
The simulation loop is composed of two steps:
  • Apply external forces
  • Compute the next timestep using neighborhoods
Implementation And Results
The test implementation simulates a 2D fluid using source data from gerris. A neighborhood of 3 by 3 was used. Each cell has associated velocities (u, v) and a pressure p. This results in nearest neighbor searches in 27 dimensions.
The results were visualized using a 2D vector field on a uniform grid. To better view the flow, particles were randomly added and advection. Two background coloring schemes were used. The first highlighted how well each neighborhood matched its nearest neighbor. Black cells had good matches. Red cells poorly matched. The second color scheme was a post-process to check if the incompressibility of the fluid was being preserved (it was known that the training data was incompressible). Blue areas indicated regions losing density. Red areas indicated regions gaining density. (Click on either image to view the corresponding video.)
Simulation: neighbor error
Simulation: density changes
Limitations and Future Work
The largest problem is that the current technique tends to converge on solutions that do no preserve density. This tends to manifest itself as sinks and sources along the boundaries, particularly at corners.
Another major issue is that Euclidean distance used on vectors with inconsistent dimensions. The current implementation solves this by normalizing all the dimensions, but this may not be the best solution. I do not have any mathematical justification for this other than it seems better than simply comparing velocity and pressure outright.
This simulation, like many fluid simulations, is subject to the CFL condition. In this case, the limitation is that information cannot be propagated faster than the neighborhood size divided by the timestep.
The bottleneck of the simulation is the searches in high dimensional space. Speeding up this process would yield great speed increase for the simulation. There are many routes that can be taken for general improvements, namely approximate solutions and code efficiency changes. One particularly interesting thing to examine would be improving search speed using temporal coherence.
Presentation
The slides for the in-class presentation for this project can be found here.
References
A.A. Efros and T.K. Leung. Texture synthesis by non-parametric sampling. In Proceedings of the Seventh International Conference on Computer Vision, Corfu, Greece, 1999.
L. Wei and M. Levoy. Fast texture synthesis using tree-structured vector quantization. In Proceedings of SIGGRAPH 2000, 2000.