Dust flux, Vostok ice core

Two-dimensional phase space reconstruction of dust flux from the Vostok core over the period 186-4 ka using the time derivative method. Dust flux is on the x-axis; its rate of change is on the y-axis. From Gipp (2001).

Monday, April 7, 2014

Reconstructing phase space

I have been reconstructing phase space portraits to study the dynamics of complex systems at various times on this blog. But I still haven't conveyed why I think the method is as powerful as it is. 

We find ourselves studying a complex system, but we know very little about it except for some observations. How we proceed depends to some extent on the model we have in our minds of how the system might work. For this analysis, I am assuming that there is some system of differential equations that describes the evolution of the system through time. I will also assume that we have no idea what those equations are.

The dynamics of the system may be represented by a series of vectors in a two- or higher-dimensional space, an example of which is depicted below.

We cannot perceive the vectors directly--all we can do is observe a trajectory of the system, and try to infer the pattern of vectors that will give rise to it. In the above cylinder, I have drawn a couple of trajectories that each represent the forward time evolution of the system from the initial condition (the red dots). Notice that although the two dots are close together, their trajectories rapidly diverge.

The system depicted above represents the phase space for a damped pendulum, but could just as easily represent the phase space for the price of a particular gold stock, where one of the axes represents the share price, a second axis represents the market consensus forecast of the future price of gold, and the third axis represents the market consensus of future costs. The efficient-market hypothesis tells us that the information plotted on the other axes is already embedded in the price. If so, then we should be able to use the price alone to reconstruct the dynamics of the price-gold price-cost system. The method used to reconstruct the phase space is a process of unfolding this extra geometric information from a single time series.

I'll illustrate the approach using a well-known example--the Lorenz attractor. It is challenging to discuss three-dimensional objects in a 2-d medium. To overcome this somewhat, I've made a few screenshots of different projections of this attractor (you can play with it here - the page is in French, but scroll down and you can play with the butterfly). 

Different two-dimensional projections of the same three-dimensional object

Another nice place to play with this function is here. At this site, you can grab and rotate the figure as it is being plotted. You will probably want to modify the number of points to 2000 (seems to be the maximum).

Hopefully by this point, you can see why this function is sometimes described as being in the shape of a butterfly's wings.

We can build our own plots of the Lorenz function. 

This is one that took about two minutes in Excel. Although I've only plotted x vs y, please note that calculating z for each time step is a requirement, as it is needed for all the subsequent calculations of x and y.
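For readers who would rather not use a spreadsheet, the same calculation can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the classic parameter values (sigma = 10, rho = 28, beta = 8/3), a forward-Euler step of 0.01, and a starting point of (1, 1, 1), none of which are specified in this post. The function name lorenz_euler is made up for illustration.

```python
def lorenz_euler(n_steps=5000, dt=0.01, x0=1.0, y0=1.0, z0=1.0,
                 sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Step the Lorenz equations forward with a simple forward-Euler scheme.

    All default values are assumptions chosen for illustration.
    """
    xs, ys, zs = [x0], [y0], [z0]
    for _ in range(n_steps):
        x, y, z = xs[-1], ys[-1], zs[-1]
        # Each rate of change depends on the other variables, which is why
        # z must be carried along even if only x and y are plotted.
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        xs.append(x + dt * dx)
        ys.append(y + dt * dy)
        zs.append(z + dt * dz)
    return xs, ys, zs


xs, ys, zs = lorenz_euler()
print(len(xs), "points; plot xs against ys to see the butterfly")
```

Forward Euler is the kind of update that is easy to type into spreadsheet cells, which is why it is used here; a smaller step or a higher-order integrator would track the true trajectory more closely.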

Imagine that this function represents the state space of the price of a gold stock (on one axis), with the market's consensus estimates for the future price of gold and the future costs to the company on the other two axes. We are in the position of trying to reconstruct the dynamics depicted in the above diagrams. However, the only data we have is price. We don't know the future price of gold. We don't even know the market consensus for the future price of gold--we can only sample a limited number of blogs, and they all seem to say that gold will soon reach $50,000 per ounce (or $200 per ounce, if you prefer).

How do we reconstruct the essential dynamics of the system from a single time series? Suppose that instead of having all the values for x, y, and z, we only had the values for x. The methodology is described starting here. We can plot our time series against either an estimate of its time derivative (time derivative method), or against a lagged copy of itself (the time delay method). 
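The two constructions can be sketched as follows. This is an illustrative sketch rather than the code used for the figures in this post: the function names are hypothetical, the derivative is estimated with a central difference (one of several possible estimators), and a sine wave stands in for the data because the expected result (a closed loop in either plot) is easy to check.

```python
import math


def time_delay_pairs(series, lag):
    """Pair each value x(t) with the lagged value x(t - lag)."""
    return [(series[i], series[i - lag]) for i in range(lag, len(series))]


def time_derivative_pairs(series, dt):
    """Pair each value x(t) with a central-difference estimate of dx/dt."""
    return [(series[i], (series[i + 1] - series[i - 1]) / (2.0 * dt))
            for i in range(1, len(series) - 1)]


# Stand-in data (an assumption): three periods of a sine wave.
samples_per_period = 400
dt = 2.0 * math.pi / samples_per_period
wave = [math.sin(i * dt) for i in range(3 * samples_per_period)]

# A lag of a quarter period turns the sine wave into a circle.
delay_plot = time_delay_pairs(wave, lag=samples_per_period // 4)
derivative_plot = time_derivative_pairs(wave, dt)
print(delay_plot[0], derivative_plot[0])
```

Either list of pairs can be handed straight to a plotting tool as (horizontal, vertical) coordinates.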

The time delay method is generally preferred because the errors tend to be smaller (although for financial time series, the errors are so small that perhaps they won't matter). The choice of a lag will influence the usefulness of the plot. If the lag is zero, then the plot will consist of a single diagonal line. If the lag is very small, the plot will only deviate slightly from a diagonal line.
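The effect of very small lags can be checked numerically. The sketch below assumes a Lorenz x series generated with forward-Euler steps and the classic parameters (sigma = 10, rho = 28, beta = 8/3, step 0.01, start at (1, 1, 1)); these values and the helper names are illustrative assumptions, not taken from this post. It measures how far, on average, the points of the delay plot sit from the diagonal.

```python
def lorenz_x(n_steps=5000, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Return only the x series of a forward-Euler Lorenz run (assumed setup)."""
    x, y, z = 1.0, 1.0, 1.0
    xs = []
    for _ in range(n_steps):
        # The right-hand side is evaluated with the old x, y, z before assignment.
        x, y, z = (x + dt * sigma * (y - x),
                   y + dt * (x * (rho - z) - y),
                   z + dt * (x * y - beta * z))
        xs.append(x)
    return xs


def mean_offset_from_diagonal(series, lag):
    """Average |x(t) - x(t - lag)|, which is sqrt(2) times the mean
    perpendicular distance of the delay-plot points from the diagonal."""
    pairs = [(series[i], series[i - lag]) for i in range(lag, len(series))]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


x = lorenz_x()
for lag in (0, 1, 10):
    print(lag, mean_offset_from_diagonal(x, lag))
```

A lag of zero gives an offset of exactly zero (every point is on the diagonal), and the offset grows as the lag increases and the plot opens up.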

Increasing the lag gives us a more useful plot. 

This plot is very similar to the x vs y plot shown above and captures the essential dynamics of the 3-dimensional graphs shown higher up. We would state that this reconstructed state space is topologically equivalent to the x vs y plot above. 

This exercise is the equivalent of taking the price data alone and reconstructing the phase space that shows the relationship between the price and the future expectation of the gold price (or perhaps the estimate of future costs). This is possible because all of the information in the expanded system is embedded within each individual time series. Consequently, the series of observations of price contains the information needed to reconstruct the geometry of the system, even though we don't necessarily know what the additional axis (axes) is (are).

As an aside, if you try this using y (from the Lorenz function), you will get much the same result. However, if you use only z, it doesn't seem to work. As an exercise, dear reader, see if you can tell me why that should be the case.

Increasing the lag further causes distortion.

If the lag is too large, the graph loses its coherence.

Neat! But not so easy to interpret. The choice of a lag is an important one, as it determines how well the geometry of the system is represented in your reconstruction. There are formal prescriptions for selecting a lag, with Abarbanel strongly favouring the first minimum of the average mutual information over the first minimum of the autocorrelation function. However, finding the first minimum of the autocorrelation is a lot easier.
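The easier of the two prescriptions, the first minimum of the autocorrelation function, can be sketched in plain Python. Everything here is an assumption made for illustration: the helper names, the forward-Euler Lorenz series with the classic parameters (sigma = 10, rho = 28, beta = 8/3, step 0.01), and the maximum lag of 200 samples. It does not implement the mutual information approach.

```python
def lorenz_x(n_steps=5000, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Return only the x series of a forward-Euler Lorenz run (assumed setup)."""
    x, y, z = 1.0, 1.0, 1.0
    xs = []
    for _ in range(n_steps):
        x, y, z = (x + dt * sigma * (y - x),
                   y + dt * (x * (rho - z) - y),
                   z + dt * (x * y - beta * z))
        xs.append(x)
    return xs


def autocorrelation(series, max_lag):
    """Autocorrelation at lags 0..max_lag, normalised so that lag 0 is 1."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    var = sum(d * d for d in dev)
    return [sum(dev[i] * dev[i + lag] for i in range(n - lag)) / var
            for lag in range(max_lag + 1)]


def first_minimum(values):
    """Index of the first local minimum, or None if there is none in range."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k
    return None


acf = autocorrelation(lorenz_x(), max_lag=200)
print("suggested lag (in samples):", first_minimum(acf))
```

If no minimum turns up within the range searched, the function returns None, and the maximum lag would need to be increased.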
