VLBI Demystified: Van Cittert Zernike Theorem

2.1 : Van Cittert Zernike Theorem

2.1.0: Signal From The Sky

Before we start, let's clear something up. In the previous chapter we looked at only a single frequency, but in reality the signal is composed of many frequencies. In VLBI we therefore take a Fourier transform first to get the frequency components and then correlate them. So keep in mind that what we are going to do is analyze the steps of an FX correlator:

F \text{ step}:

At each station, take the Fourier Transform (FFT) of the incoming voltage signal.

X \text{ step}:

Get the time-averaged multiplication of the Fourier coefficients from the F step.
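The two steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not how a production correlator is written: the names `dft_bin` and `fx_correlate`, the segment length, and the test tone are all hypothetical, and the second station's coefficient is conjugated before multiplication, as in the correlation written out in 2.1.3.

```python
import cmath
import math


def dft_bin(samples, k):
    """Return the k-th DFT coefficient of a list of samples (the F step for one bin)."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))


def fx_correlate(v1, v2, n_fft, k):
    """F step on each n_fft-sample segment, then X step: average X1 * conj(X2) at bin k."""
    n_seg = min(len(v1), len(v2)) // n_fft
    acc = 0j
    for s in range(n_seg):
        seg1 = v1[s * n_fft:(s + 1) * n_fft]
        seg2 = v2[s * n_fft:(s + 1) * n_fft]
        acc += dft_bin(seg1, k) * dft_bin(seg2, k).conjugate()
    return acc / n_seg


def tone(i, k, n_fft):
    """Hypothetical test signal: a cosine that fits exactly k cycles in n_fft samples."""
    return math.cos(2 * math.pi * k * i / n_fft)


# Toy input: both stations see the same tone; station 2 lags by `delay` samples.
n_fft, k, delay = 64, 5, 3
v1 = [tone(i, k, n_fft) for i in range(n_fft * 8)]
v2 = [tone(i - delay, k, n_fft) for i in range(n_fft * 8)]

vis = fx_correlate(v1, v2, n_fft, k)
expected_phase = 2 * math.pi * k * delay / n_fft  # phase produced by the delay
print(abs(vis), cmath.phase(vis), expected_phase)
```

The phase of the averaged product carries the delay between the two stations, which is the quantity the rest of this chapter relates to the sky.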

First we analyze the F \text{ step} :

We will take a different approach to prove the van Cittert Zernike Theorem. The content follows Chapter 1 of Synthesis Imaging in Radio Astronomy II.

Suppose we fit the universe in a 3D Cartesian grid (x,y,z) .

In this coordinate system, suppose some remote location \vec{R} experiences a band-limited electric field, which we describe as a voltage function of time as shown below:

Now we can take its Fourier Transform to get its frequency components:

For now, ignore the fact that we cannot integrate over time out to \pm \infty ; we will come back to it later.

From here on we work in the frequency domain. Given the linearity of the Fourier Transform, we can focus on one particular frequency, say \nu :

Now suppose a station on earth is located at \vec{r} . The Fourier coefficient at frequency \nu that the station observes at time t is the sum of the contributions from all \vec{R} over the sky:

Where P_\nu(\vec{R},\vec{r}) is referred to as the propagator, which describes how the electric field propagates from \vec{R} to \vec{r} .

Next we make 2 simplifications:

1.:

Ignore polarization, the electric fields are just scalars.

2.:

Instead of working with a 3D coordinate \vec{R} = (x,y,z) , we assume all sources lie on the surface of a giant sphere of radius |\vec{R}| , so now \vec{R}=(\theta,\phi) .

With the simplification that everything lies on a spherical surface, we can rewrite E_\nu(\vec{r},\nu)|_{t} as:

In summary, the electric field that a station at \vec{r} sees at frequency \nu over the sky is:
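In the notation that 2.1.3 uses when it restates the raw correlation, this summary can be written as below. Here \xi_\nu is the field distribution on the sky sphere and dS is a surface element of that sphere; the spherical-wave propagator is read off from that later derivation, so treat this as a consistent sketch rather than a new result.

```latex
E_\nu(\vec{r},\nu)|_{t} = \int \xi_\nu(\vec{R},\nu)|_{t} \, \frac{e^{i2\pi \nu |\vec{R}-\vec{r}|/c}}{|\vec{R}-\vec{r}|} \, dS
```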

2.1.1: Correlation of Sky Signals

Now we analyze the X \text{ step} :

Now we have two stations located at \vec{r_1}, \vec{r_2} . Let's calculate the correlation (time-averaged multiplication) of their received signals:

Keep in mind that V_\nu(\vec{r_1},\vec{r_2}) is the time average of the product of one station's Fourier coefficient at frequency \nu with the complex conjugate of the other's.

A side note: the difference between E_\nu(\vec{r_1}) and E_\nu(\vec{r_2}) is a time delay. If the electric field wave front arrives at \vec{r_1} at time t and the same wave front arrives at \vec{r_2} at t+\tau , then we can write the correlation as:

Anyway, let's evaluate V_\nu(\vec{r_1},\vec{r_2}) :

If you are confused about how t and T relate, see the figures below:

Figure :

At station \vec{r_1} , for a particular dS_1 , the electric field over T

Figure :

At station \vec{r_2} , for a particular dS_2 , the electric field over T

As you can see, every dt gives an E_\nu(\vec{r_1},\nu)|_{t} as time moves on within the integration time T .

To solve the above equation, we need to go over 2 important concepts:

1. \text{ Spatial Incoherence}:

The electric fields from different areas of the sky sphere are uncorrelated, which can be expressed mathematically as follows:

2. \text{ Solid Angle} (\Omega) \leftrightarrow \text{ Spherical Surface Area} (S):
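In symbols, the two concepts above can be sketched as follows. The bracket is the same time-averaged product used for V_\nu , and the solid-angle relation is the one applied at the end of this section; the exact normalization of the first line is an assumption.

```latex
\begin{aligned}
\langle \, \xi_\nu(\vec{R_1},\nu)|_{t} \;,\; \xi_\nu(\vec{R_2},\nu)|_{t} \, \rangle &= 0 \quad \text{for } \vec{R_1} \neq \vec{R_2} \\
d\Omega &= \frac{dS}{|\vec{R}|^2}
\end{aligned}
```

The first line is what collapses the double integral over dS_1 \, dS_2 into a single integral over the sky.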

So with the above 2 concepts, we continue with solving the correlation V_\nu(\vec{r_1},\vec{r_2}) :

Next we make the 2 following approximations:

1. \text{ For Denominator: } |\vec{R}-\vec{r_1}| |\vec{R}-\vec{r_2}|:

Because the sky sphere has a radius |\vec{R}| that is far greater than |\vec{r_1}| and |\vec{r_2}| , we have:

Applying this approximation to the correlation V_\nu(\vec{r_1},\vec{r_2}) :

2. \text{ Binomial Approx: } (1+x)^n \approx 1+nx :

Applying this approximation to the correlation V_\nu(\vec{r_1},\vec{r_2}) :
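Spelled out, the two approximations amount to the following. This is a sketch that introduces \hat s = \vec{R}/|\vec{R}| as the unit vector toward the sky element and drops terms of second order in |\vec{r}|/|\vec{R}| .

```latex
\begin{aligned}
|\vec{R}-\vec{r_1}| \, |\vec{R}-\vec{r_2}| &\approx |\vec{R}|^2 \\
|\vec{R}-\vec{r}| &= |\vec{R}| \left( 1 - \frac{2\,\hat s \cdot \vec{r}}{|\vec{R}|} + \frac{|\vec{r}|^2}{|\vec{R}|^2} \right)^{1/2} \approx |\vec{R}| - \hat s \cdot \vec{r} \\
\frac{e^{i2\pi \nu |\vec{R}-\vec{r_1}|/c} \; e^{-i2\pi \nu |\vec{R}-\vec{r_2}|/c}}{|\vec{R}-\vec{r_1}| \, |\vec{R}-\vec{r_2}|} &\approx \frac{e^{-i2\pi \nu \hat s \cdot (\vec{r_1}-\vec{r_2})/c}}{|\vec{R}|^2}
\end{aligned}
```

The common path length |\vec{R}| cancels between the two exponentials, which is why only the difference of the station positions survives.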

Now we convert the spherical surface area to solid angle: d\Omega = \frac{dS}{|\vec{R}|^2}
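With that conversion, the correlation takes the form that 2.1.3 later quotes. In this sketch I_\nu(\hat s) stands for the time-averaged intensity of \xi_\nu in direction \hat s , up to whatever normalization the derivation chooses.

```latex
V_\nu(\vec{r_1},\vec{r_2}) \approx \int I_\nu(\hat s) \, e^{-i2\pi \nu \hat s \cdot (\vec{r_1}-\vec{r_2})/c} \, d\Omega
```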

2.1.2: (u,v)\leftrightarrow(l,m)

In this section we analyze the correlation result V_\nu(\vec{r_1},\vec{r_2}) further and eventually show the 2D Fourier Transform relationship between the baseline observation and the sky image.

Starting from the correlation formula from 2.1 Van Cittert Zernike Theorem - 2.1.1: Correlation of Sky Signals :

Notice that the function depends only on the difference between the two stations' location vectors, so we can rewrite the function as:

\vec{b} is referred to as the baseline vector, the same vector shown in 1.1 $\cos$ Interferometer - 1.1.3: Put Geometric Vectors on Cartesian Grid

Now we set up a 3D Cartesian coordinate system where \vec{b} has coordinate:

and the unit vector \hat s has coordinate:

Applying these new coordinates to V_\nu(\vec{b}) , we have:

Now let's see how to convert solid angle d\Omega to Cartesian format (l,m,n) :

Using Jacobian Matrix:
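A sketch of that Jacobian computation, assuming the usual parametrization of the unit vector by a polar angle \theta measured from the n axis and an azimuth \phi (the angle names are an assumption made for this sketch):

```latex
\begin{aligned}
(l, m, n) &= (\sin\theta\cos\phi,\; \sin\theta\sin\phi,\; \cos\theta), \qquad d\Omega = \sin\theta \, d\theta \, d\phi \\
\left| \frac{\partial(l,m)}{\partial(\theta,\phi)} \right| &=
\begin{vmatrix}
\cos\theta\cos\phi & -\sin\theta\sin\phi \\
\cos\theta\sin\phi & \sin\theta\cos\phi
\end{vmatrix}
= \sin\theta\cos\theta \\
dl \, dm &= \cos\theta \, \sin\theta \, d\theta \, d\phi = n \, d\Omega
\quad\Rightarrow\quad
d\Omega = \frac{dl \, dm}{\sqrt{1-l^2-m^2}}
\end{aligned}
```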

So now we can have a correlation function with only Cartesian coordinate:

Now let's see what has to be done to get rid of the annoying \sqrt{1-l^2-m^2} term.

Let me write down the correlation here again:

Now if we restrict \hat s to a small solid angle centered on the phase center \hat s_0 , i.e., integrate only over the directions close to the phase center unit vector \hat s_0 , then we can make the following simplification:

which implies that :

The above result tells us that all the points included in the integral lie on the same plane.
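A short sketch of why the points lie on a common plane, assuming the simplification is \hat s \approx \hat s_0 + \vec{\sigma} (as restated in 2.1.3) and that the implied relation is the usual orthogonality condition:

```latex
1 = |\hat s|^2 \approx |\hat s_0 + \vec{\sigma}|^2 = 1 + 2 \, \hat s_0 \cdot \vec{\sigma} + |\vec{\sigma}|^2
\quad\Rightarrow\quad
\hat s_0 \cdot \vec{\sigma} \approx 0 \quad (\text{dropping } |\vec{\sigma}|^2)
```

Every \vec{\sigma} is then perpendicular to \hat s_0 , so the end points of \hat s_0 + \vec{\sigma} all sit on the plane tangent to the sky sphere at the phase center.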

So now if we set up our coordinate system such that \hat s_0 has coordinates:

then \vec{\sigma} becomes:

Before we apply the change to the correlation function, note that \vec{b} is (u,v,w) rather than (u,v,0) , because we set up the coordinate system so that \hat s_0 = (0,0,1) .

Now we apply the changes to the correlation function:

Looking at the e^{-i2\pi w} term, we can express it as:

So we have:

In the next section we will see how to remove the e^{ - i2\pi \nu (\hat s_0 \cdot \vec{b})/c } term by inserting an artificial delay into the received signals.
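To get a feel for how good the small-field simplification is, the sketch below evaluates the phase that is dropped when \sqrt{1-l^2-m^2} is replaced by 1. It assumes (u,v,w) are measured in wavelengths, so that the neglected term is w(\sqrt{1-l^2-m^2}-1) turns; the value of w and the offsets are hypothetical.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def neglected_phase_turns(l, m, w):
    """Phase term w * (n - 1) in turns, with n = sqrt(1 - l^2 - m^2).

    n - 1 is evaluated as -(l^2 + m^2) / (1 + n) to avoid cancellation
    when l and m are tiny.
    """
    r2 = l * l + m * m
    n = math.sqrt(1.0 - r2)
    return -w * r2 / (1.0 + n)


w = 1.0e8  # hypothetical w coordinate, in wavelengths
errors = {}
for offset in (0.01, 0.1, 1.0):  # offsets from the phase center, in arcseconds
    l = offset * ARCSEC
    errors[offset] = neglected_phase_turns(l, 0.0, w)
    print(f"{offset:5.2f} arcsec: {errors[offset]:.3e} turns")
```

The neglected phase grows roughly as w(l^2+m^2)/2 , so the approximation degrades with both the field of view and the w component of the baseline.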

2.1.3: Delay For Phase Center

Let's start from the raw correlation function:

\begin{aligned}
V_\nu(\vec{r_1},\vec{r_2}) &= \langle \, E_\nu(\vec{r_1},\nu)|_{t} \;,\; E_\nu(\vec{r_2},\nu)|_{t} \, \rangle \\
&= \frac{1}{2T} \int_{-T}^{T}\left[ \int \xi_\nu(\vec{R_1},\nu)|_{t} \frac{e^{i2\pi \nu |\vec{R_1}-\vec{r_1}|/c}}{|\vec{R_1}-\vec{r_1}|} dS_1 \; \int \xi^*_\nu(\vec{R_2},\nu)|_{t} \frac{e^{-i2\pi \nu |\vec{R_2}-\vec{r_2}|/c}}{|\vec{R_2}-\vec{r_2}|} dS_2 \right] dt \\
&= \frac{1}{2T} \int_{-T}^{T}\left[ \iint \xi_\nu(\vec{R_1},\nu)|_{t} \; \xi^*_\nu(\vec{R_2},\nu)|_{t} \frac{e^{i2\pi \nu |\vec{R_1}-\vec{r_1}|/c}}{|\vec{R_1}-\vec{r_1}|} \frac{e^{-i2\pi \nu |\vec{R_2}-\vec{r_2}|/c}}{|\vec{R_2}-\vec{r_2}|} dS_1 dS_2 \right] dt \\
= V_\nu(u,v,w) &= \iint I_{\nu}(\hat s) \, e^{-i2\pi \nu \hat s \cdot \vec{b} /c} \frac{1}{\sqrt{1-l^2-m^2}} \, dl \, dm \\
&(\text{considering only the area around } \hat s_0 : \;\; \hat s \approx \hat s_0 + \vec{\sigma} \; , \;\; \vec{b} = \vec{r_1} - \vec{r_2} \; , \;\; \sqrt{1-l^2-m^2} \approx 1 ) \\
= V_\nu(u,v,w)|_{s_0} &= \iint {\color{green} I_\nu(l,m)} \, e^{i2\pi \frac{\nu}{c} [-( \hat s_0 + \vec{\sigma}) \cdot (\vec{r_1} - \vec{r_2}) ]} \, dl \, dm \\
&= \iint {\color{green} I_\nu( \hat s_0 + \vec{\sigma} )} \, e^{i2\pi \frac{\nu}{c} ( \hat s_0 + \vec{\sigma} ) \cdot \vec{r_2} } \, e^{-i2\pi \frac{\nu}{c} ( \hat s_0 + \vec{\sigma} ) \cdot \vec{r_1} } \, dl \, dm \\
&= \iint {\color{green} \langle \xi_\nu( \hat s_0 + \vec{\sigma} ,\nu)|_{t} \;,\; \xi_\nu(\hat s_0 + \vec{\sigma} ,\nu)|_{t} \rangle } \; e^{i2\pi \frac{\nu}{c} \hat s_0 \cdot \vec{r_2} } \; {\color{blue} e^{i2\pi \frac{\nu}{c} \vec{\sigma} \cdot \vec{r_2} } } \; e^{-i2\pi \frac{\nu}{c} \hat s_0 \cdot \vec{r_1} } \; {\color{blue} e^{-i2\pi \frac{\nu}{c} \vec{\sigma} \cdot \vec{r_1} } } \; dl \, dm \\
&= {\color{blue} \iint } {\color{green} \frac{1}{2T} \int_{-T}^{T} \xi_\nu( \hat s_0 + \vec{\sigma} ,\nu)|_{t} \; \xi^*_\nu(\hat s_0 + \vec{\sigma} ,\nu)|_{t} \, dt } \; e^{i2\pi \frac{\nu}{c} \hat s_0 \cdot \vec{r_2} } \; e^{-i2\pi \frac{\nu}{c} \hat s_0 \cdot \vec{r_1} } \; {\color{blue} e^{-i2\pi \frac{\nu}{c} \vec{\sigma} \cdot (\vec{r_1}-\vec{r_2} ) } \; dl \, dm } \\
&\color{blue}(\text{notice that } \vec{\sigma} \text{ varies with } dl \, dm)
\end{aligned}

Remember earlier we showed that:

So if we insert the multiplicative term \large \color{red}e^{ i2\pi \nu (\hat s_0 \cdot \vec{b})/c } , we will get a beautiful result of:

So now let's see how to achieve the same beautiful result by applying time delay on received signals:

Now the red multiplicative terms can be viewed as applying phase shifts for the received signals.

And we know that a phase shift \phi corresponds to a time delay \tau :

So we can apply the phase shift as a time delay. In other words, instead of inserting a multiplicative term, we apply artificial time delays \tau_1 \; \& \; \tau_2 to the respective stations' received signals:
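The equivalence between a time delay and a phase factor on the Fourier coefficient can be checked numerically. The sketch below is a toy under stated assumptions: it uses a discrete, circular delay on a periodic test signal, and the sign of the exponent follows the DFT convention chosen here, which may differ from the convention of the derivation above. The helper `dft_bin` and all numeric values are hypothetical.

```python
import cmath
import math


def dft_bin(samples, k):
    """Return the k-th DFT coefficient of a list of samples."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))


n, k = 32, 4
nu = k / n   # frequency of bin k, in cycles per sample
tau = 5      # delay, in samples

# Periodic test signal with energy in bin 3 and in the bin of interest, k.
signal = [
    math.sin(2 * math.pi * 3 * i / n) + 0.5 * math.cos(2 * math.pi * k * i / n + 0.3)
    for i in range(n)
]
delayed = [signal[(i - tau) % n] for i in range(n)]  # circular delay by tau samples

coeff = dft_bin(signal, k)
coeff_delayed = dft_bin(delayed, k)
phase_factor = cmath.exp(-2j * math.pi * nu * tau)  # phase shift of 2*pi*nu*tau radians

print(abs(coeff_delayed - coeff * phase_factor))
```

Delaying the samples and multiplying the coefficient by the phase factor give the same result, which is why the correction can be applied either as a delay before the F step or as a phase rotation after it.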

Many texts mention that the time delay operation (applying delay \tau_1 for station 1 and \tau_2 for station 2) conceptually makes the wave fronts from both stations arrive at the new axis formed from \hat s_0 at the same time. See the simplified case of a 1D sky and baseline in the visualization below.

Figure

As for the 3D case, people often use the earth center as the coordinate origin (0,0,0) . With that, after applying delay \tau_1 for station 1 and \tau_2 for station 2, you can see from the visualization below that the wave fronts conceptually arrive at the uv -plane at the same time. Also notice that \color{yellow}\hat s_0 points at the center of the lm -plane.

Figure : Stations \vec{r_1} (red) and \vec{r_2} (green) in (u,v,w) coordinates with the earth center at (0,0,0) , the uv -plane, the lm -plane, \hat s_0 = (0,0,1) , and the delays \tau_1 = \frac{\hat s \cdot \vec{r_1}}{c} and \tau_2 = \frac{\hat s \cdot \vec{r_2}}{c} .

Now we have a clean result that shows that the 2D intensity at frequency \nu on the image plane (l,m) centered at s_0 has a 2D Fourier Transform relation with the correlation of 2 stations on earth, where (u,v) is determined by the position vector formed by the 2 stations:
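In the sign convention of the derivation in this section, and assuming (u,v) are the baseline components in wavelengths, this relation is the familiar 2D Fourier pair sketched below:

```latex
V_\nu(u,v) = \iint I_\nu(l,m) \, e^{-i2\pi (ul + vm)} \, dl \, dm
```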

The visualization below shows the relations among (u,v,w), (l,m,0), \vec{b}, \hat s_0 .

Figure :

Baseline vector \vec{b} \text{ in }(u,v) projects onto \vec{\sigma} \text{ in } (l,m) plane.

Legend: earth-centered coordinates; telescope locations; \vec{b} : baseline vector; image center; lm -plane; \hat s_0 : unit vector pointing at s_0 ; (u,v,w) coordinates; uv -plane; \vec{b_p} : projection of \vec{b} onto the uv -plane.

source: So You Want to Do VLBI by Bob Campbell

2.1.4: Discrete Sampling

If we are able to measure all V_{\nu} (u,v) for every location (u,v) , then we can recover the intensity map I_\nu(l,m) perfectly.

But unfortunately, we can only measure some of the (u,v) pairs, which, mathematically speaking, is like discrete-time sampling:

Where S(u,v) is the sampling function:
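A common way to write these two expressions is sketched below. The symbol I^D_\nu for the image recovered from the sampled data and the index k over the measured points (u_k, v_k) are names introduced here for illustration.

```latex
\begin{aligned}
S(u,v) &= \sum_{k} \delta(u-u_k,\, v-v_k) \\
I^D_\nu(l,m) &= \iint S(u,v) \, V_\nu(u,v) \, e^{i2\pi (ul+vm)} \, du \, dv
\end{aligned}
```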

B(l,m) is referred to as the synthesized beam or point spread function:
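As a minimal numerical sketch of the synthesized beam, the code below treats B(l,m) as the inverse Fourier transform of the sampling function, normalized so that B(0,0)=1 , and checks that the image of a single point source reconstructed from the sampled visibilities is just the beam shifted to the source position. The (u,v) list, the source position, the flux, and all function names are hypothetical; each sample is implicitly paired with its conjugate point (-u,-v) , which is why only the real part is kept.

```python
import cmath
import math

# Hypothetical (u, v) samples in wavelengths; a real observation has many more.
UV = [(100.0, 0.0), (0.0, 150.0), (80.0, -60.0), (-120.0, 40.0)]


def synthesized_beam(uv, l, m):
    """B(l, m): normalized inverse transform of the sampling function."""
    total = sum(math.cos(2.0 * math.pi * (u * l + v * m)) for u, v in uv)
    return total / len(uv)


def visibility_point_source(u, v, l0, m0, flux=1.0):
    """V(u, v) of one point source at (l0, m0), using the e^{-i 2 pi (ul + vm)} convention."""
    return flux * cmath.exp(-2j * math.pi * (u * l0 + v * m0))


def dirty_image(uv, vis, l, m):
    """Image value at (l, m) reconstructed from the sampled visibilities only."""
    total = sum(
        (vk * cmath.exp(2j * math.pi * (u * l + v * m))).real
        for (u, v), vk in zip(uv, vis)
    )
    return total / len(uv)


l0, m0, flux = 2.0e-3, -1.0e-3, 3.0
vis = [visibility_point_source(u, v, l0, m0, flux) for u, v in UV]

peak = dirty_image(UV, vis, l0, m0)          # value at the source position
off = dirty_image(UV, vis, 5.0e-3, 4.0e-3)   # value somewhere else in the field
shifted_beam = flux * synthesized_beam(UV, 5.0e-3 - l0, 4.0e-3 - m0)
print(peak, off, shifted_beam)
```

With only a few samples the beam has strong sidelobes, so the reconstructed image is nonzero away from the source even though the sky contains a single point.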