Intro to Relativity

Hello there!

If you are reading this post, I assume you found it while doing background reading for the Summer school.

If you did not find this for the summer school, no probs! Just go ahead, and try to have a good read.

First off, the Summer school has been divided into 4 parts: (i) Introduction to General Relativity and relativistic effects in Astronomy (almost purely qualitative), (ii) Electromagnetism and electromagnetic radiation in Astronomy, (iii) Basics of Astronomy: optics, filters, and observations, (iv) Machine learning and Deep learning in Astronomy.

Please note I will not put in a lot of math, but will instead refer you to the corresponding books for that. Keeping in mind that the session is for first-year undergrads, I am trying to do some Mathematica coding to go with it (if I am not lazy, that is 😛)

Finally, don’t hesitate to post any questions in the comments section!

Other lectures in this series:

1. Radiation in Astrophysics

2. Basics of Astronomy: Photometry and allied stuff

4. Deep learning in Astronomy

If anyone is interested in discussing any of these topics, you could comment here, or just search for the title on Quora, where I have uploaded all of this stuff under Abstracted Abstract Science.

General relativity, you know, that which you think you know because you read A Brief History of Time, but actually you don’t! (Can’t help it, I am watching too much of Last Week Tonight 😛 )

Newton, in the 1600s, was phenomenal in his claim that the thing which causes an apple to fall down to Earth also makes the Earth go around the Sun. And thus, Newtonian gravity as we know it was born.

Newtonian gravity treats mass as a property of matter that exerts a force at a distance, like how paper sticks to a comb, and the force follows an inverse-square rule. Inverse-square rules arise when the force depends on the inverse of the square of the distance between the carriers (charge, mass, etc.), like so: F = \frac{k}{r^2}, where k lumps together the coupling constant and the product of the two masses/charges (for gravity, k = G m_1 m_2). And whatever “force information” passed from one body to the other did so instantaneously; i.e., if a small disturbance were made to body A, it would be felt instantaneously by body B. This was the assumption, though.
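To get a feel for the inverse-square rule, here is a minimal Python sketch. The function name and the rounded values for the masses and the Earth-Sun distance are my own choices for illustration, not something from the lecture notes.

```python
G = 6.67430e-11  # Newton's gravitational constant in m^3 kg^-1 s^-2 (CODATA 2018)

def newtonian_force(m1, m2, r):
    """Magnitude of the Newtonian gravitational force F = G*m1*m2/r^2, in newtons."""
    if r <= 0:
        raise ValueError("separation r must be positive")
    return G * m1 * m2 / r**2

# Illustrative (rounded) values: Sun, Earth, and their mean separation.
M_SUN = 1.98847e30       # kg
M_EARTH = 5.9722e24      # kg
AU = 1.495978707e11      # m

force_at_1au = newtonian_force(M_SUN, M_EARTH, AU)
force_at_2au = newtonian_force(M_SUN, M_EARTH, 2 * AU)

print(f"Force at 1 AU: {force_at_1au:.3e} N")
print(f"Ratio F(1 AU)/F(2 AU): {force_at_1au / force_at_2au:.1f}")
```

Doubling the distance cuts the force to a quarter, which is all that "inverse square" means.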

A fundamental change occurred in this perspective when Einstein proposed his Special Theory of Relativity and General Theory of Relativity. Let’s take these one by one.

Special relativity: There are 2 seminal rules of Special relativity which make it special. They are:

  1. No inertial frame of reference is special: the laws of physics are the same in all of them.
  2. There is one absolute speed which nothing can exceed; a priori, it need not be the speed of light. This speed is the same in all frames of reference.

Now, for the proofs, etc. arising from these postulates, I shall refer you to Landau and Lifshitz, Classical Theory of Fields, which will also get you started with tensor algebra. But moving on, there are some specific results which are obtained:

  1. The speed of light in vacuum was found to be invariant in all frames of reference, and nothing in the world can travel faster than this speed, about 3\times 10^8 m/s. That is, no information can travel faster than this speed, and instantaneous transfer of information is not possible.
  2. If you, as a massive observer, go faster, your clock will appear to tick slower to an observer you are moving past (time dilation), and distances along your direction of motion will appear shorter to you (length contraction).
  3. Mass and energy are equivalent, and are related by the famous equation E=mc^2.
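The three results above can be put into numbers with a short Python sketch. The helper names are hypothetical, and the formulas used are the standard textbook ones (Lorentz factor \gamma = 1/\sqrt{1-v^2/c^2}), which I have not derived in this post.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s (exact SI value)

def lorentz_factor(v):
    """gamma = 1/sqrt(1 - v^2/c^2) for a speed v in m/s, with |v| < c."""
    beta = v / C
    if abs(beta) >= 1.0:
        raise ValueError("a massive observer must move slower than c")
    return 1.0 / math.sqrt(1.0 - beta * beta)

def dilated_time(proper_time, v):
    """Time elapsed in the lab while a clock moving at v ticks off proper_time."""
    return lorentz_factor(v) * proper_time

def contracted_length(rest_length, v):
    """Length along the direction of motion, as measured in the frame where the rod moves at v."""
    return rest_length / lorentz_factor(v)

def rest_energy(mass_kg):
    """E = m c^2, in joules."""
    return mass_kg * C**2

v = 0.6 * C
print(f"gamma at 0.6c         : {lorentz_factor(v):.4f}")
print(f"1 s of proper time    : {dilated_time(1.0, v):.4f} s in the lab")
print(f"10 m rod moving at v  : {contracted_length(10.0, v):.4f} m")
print(f"rest energy of 1 kg   : {rest_energy(1.0):.4e} J")
```

At everyday speeds gamma is so close to 1 that none of this is noticeable, which is why Newton got away without it.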

Special relativity starts creeping into all equations at really high temperatures/velocities/energies. It appears in a branch of physics called High Energy Physics, where particles are bouncing around at really high velocities.

Space and time, it turns out, are quite coupled. This can be seen in Special relativity through time dilation and length contraction, both of which follow from something called the Lorentz Transform. Now, for any given coordinate system, we must have a measure of “distance between two points”.

This is given by something called the interval, defined as:

ds^2 = -c^2dt^2+dx^2+dy^2+dz^2

This is defined in 4-D space-time, with coordinates (ct,x,y,z) and the time term carrying the negative sign; c is the universal speed, which we take to be the speed of light. It measures the separation between two events, i.e. two locations in space at two different times, and tells us how that separation compares with the distance a signal moving at c could cover between them.

Before we proceed, there is an important concept which must be appreciated: Spacetime diagram.

If we consider only one spatial coordinate, and the time coordinate, we get a diagram as shown in Fig:1, and if we are to consider one more spatial coordinate, we will get the cone in Fig:2. The light cone gives the path for light between two ‘points in spacetime’.

Let’s say we have two points A (x1,y1,z1) and B (x2,y2,z2), at two different instants of time t1 and t2 respectively. Let us say a person at A at time t1 wants to send a signal, a flash of light, to B. Then the interval tells us whether the signal will reach B at t2 or not. For the signal to reach B exactly at t2, we must have ds^2 = 0, i.e. the spatial distance between A and B must equal c(t2-t1), for we must satisfy the holy relation distance = speed x time!

So, in the spacetime diagram (with time on the vertical axis, in units where light travels at 45 degrees), if a curve has slope>1, it would mean a velocity<c, and vice-versa. Hence, since no massive object can go faster than the speed of light (pliss don’t bring scifi, or tachyons here!), all of the curves should lie inside the light cone. The surface of the light cone gives us the maximum speed at which information can travel between two points in spacetime.

Suppose there are two points which can be connected only by a line with slope<1. It would mean information cannot be passed between those two points of spacetime. The upper cone is known as the absolute future, and the lower cone as the absolute past. Information can only travel into the future, so we almost always neglect the past light cone.
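The light cone business above boils down to the sign of ds^2. Here is a small Python sketch that computes the interval between two events and says whether they can be connected by a signal. The function names and the tolerance are my own choices; the labels timelike/lightlike/spacelike are the standard names for inside, on, and outside the light cone.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def interval_squared(event_a, event_b):
    """ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2 for events given as (t, x, y, z) in SI units."""
    dt = event_b[0] - event_a[0]
    dx, dy, dz = (event_b[i] - event_a[i] for i in (1, 2, 3))
    return -(C * dt) ** 2 + dx**2 + dy**2 + dz**2

def classify(event_a, event_b, rel_tol=1e-12):
    """Return 'timelike' (inside the cone), 'lightlike' (on it) or 'spacelike' (outside)."""
    dt = event_b[0] - event_a[0]
    dx, dy, dz = (event_b[i] - event_a[i] for i in (1, 2, 3))
    scale = (C * dt) ** 2 + dx**2 + dy**2 + dz**2  # sets what "close to zero" means
    s2 = interval_squared(event_a, event_b)
    if abs(s2) <= rel_tol * scale:
        return "lightlike"
    return "timelike" if s2 < 0 else "spacelike"

origin = (0.0, 0.0, 0.0, 0.0)
print(classify(origin, (1.0, 0.5 * C, 0.0, 0.0)))  # slower than light could make it
print(classify(origin, (1.0, C, 0.0, 0.0)))        # only a flash of light makes it
print(classify(origin, (1.0, 2.0 * C, 0.0, 0.0)))  # nothing can make it
```

Only the first two pairs of events can exchange information; the third pair lies outside each other's light cones.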

Now then, I suppose the smart ones reading this must have noticed how the equation for the interval is quite similar to the normal ‘distance formula’, except for the negative sign on the time term. You may ask, why does that negative sign come about? Turns out, the answer lies in the Lorentz transformation, and in the group theory of the Lorentz group. So, for now, we just take the sign for granted, with a simple reference as given below:


Fig:3 gives a feel for what the Lorentz transform is. It is easy enough to derive on paper, but it has vast implications. Every major theory in physics, when proposed, is required to be Lorentz invariant, i.e., the laws it describes must take the same form whether you perform an experiment in one inertial frame of reference or in another.

Lorentz invariance brings us to the concept of 4-vectors. Basically, these are vectors constructed out of (t,x,y,z), which we have seen before, but they have one important property.

Dot products of 4-vectors are Lorentz invariant.
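You can check this invariance numerically. The sketch below boosts two arbitrary 4-vectors along x and compares their dot product before and after. The function names are hypothetical, the sign convention (-,+,+,+) matches the interval written earlier, and the boost formula is the standard one for motion along the x axis with beta = v/c.

```python
import math

def boost_x(four_vector, beta):
    """Lorentz boost of (ct, x, y, z) along the x axis with velocity beta = v/c, |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct, x, y, z = four_vector
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

def minkowski_dot(a, b):
    """Dot product of two 4-vectors with signature (-, +, +, +)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

# Two arbitrary 4-vectors (numbers picked only for illustration).
a = (5.0, 1.0, 2.0, 3.0)
b = (2.0, -1.0, 0.5, 4.0)

before = minkowski_dot(a, b)
after = minkowski_dot(boost_x(a, 0.6), boost_x(b, 0.6))

print(f"dot product in the original frame: {before:.12f}")
print(f"dot product in the boosted frame : {after:.12f}")
```

The individual components change under the boost, but the dot product agrees up to floating-point rounding.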

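We have many four-vectors available: energy-momentum, displacement, and the electromagnetic 4-potential, among others. Since these dot products are invariant, every observer agrees on their value, which makes them very handy for calculations (for example, together with conservation of the total energy-momentum in a process).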

Now, I have been woefully ignoring one quantity: time. We have happily written time, but what do we mean when we say ‘time’? In which frame is it measured? And from where do you measure the displacement?

The time generally used is called proper time, denoted by \tau. This is the time measured in the frame of reference of the object of interest. We can either go into the frame of reference of the object and do our analysis, or stay in the lab frame. Most times, it is easier to go into the particle frame, since we avoid carrying around factors of the Lorentz factor \gamma. But it is really up to the physicists themselves. Personally, I find it easier to work in the center-of-mass frame of the process, since the total momentum there is conserved and equals 0.

There are many effects of Special relativity, the primary ones being length contraction, time dilation, and superluminal motion. All three are easy enough to derive on your own, but the third one has an interesting consequence.


In Fig:4, I have shown a simplified diagram of a body at A moving over to B. Assume the body emits a photon at A at time t_1, and another one at B at time t_2. Now, without any approximation, solve for the velocity of the body as seen by the observer at O. A simple calculation gives the apparent transverse velocity:

v_T = \frac{v\sin\theta}{1-\frac{v\cos\theta}{c}}

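That apparent velocity can exceed the speed of light! Hence, superluminal motion. Nothing actually outruns light here: the effect arises because a fast body moving at a small angle to our line of sight nearly keeps up with its own photons. This is important to people who study relativistic jets, which often appear to move superluminally when pointed close to our line of sight, and which are commonly found in quasars.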
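Here is a quick Python sketch of the formula for v_T, written in units of c. I am assuming \theta is the angle between the body's velocity and the line of sight to the observer; the function name and the sample numbers are made up for illustration.

```python
import math

def apparent_transverse_beta(beta, theta):
    """Apparent transverse speed v_T/c = beta*sin(theta) / (1 - beta*cos(theta)).

    beta  : true speed of the body in units of c (0 <= beta < 1)
    theta : angle between the velocity and the line of sight, in radians
    """
    return beta * math.sin(theta) / (1.0 - beta * math.cos(theta))

beta = 0.99  # illustrative jet speed
for degrees in (90, 45, 20, 10, 5):
    theta = math.radians(degrees)
    print(f"theta = {degrees:2d} deg -> v_T/c = {apparent_transverse_beta(beta, theta):.3f}")

# The apparent speed is largest when cos(theta) = beta, where it equals beta*gamma.
gamma = 1.0 / math.sqrt(1.0 - beta**2)
peak = apparent_transverse_beta(beta, math.acos(beta))
print(f"maximum v_T/c = {peak:.3f} (beta*gamma = {beta * gamma:.3f})")
```

The true speed never exceeds c in this calculation; only the apparent transverse speed does, and only at small angles.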

Now, let’s add weight to our content and move on to General relativity.

General relativity: There are no postulates/axioms per se, but there are equations. Tensorial equations, at that, and it all starts with what’s known as space-time. The gist of the theory is this: matter and energy are equivalent, and they bend space-time. This ‘bending’ of space-time is what is perceived by other bodies as ‘Gravity’.

More specifically, if we consider a ‘vectorized notation’ of the interval, we get something like below:

ds^2 = \eta_{\mu\nu}dX^{\mu}dX^{\nu}, where,

\eta is known as the Metric tensor. The metric tensor quantifies the kind of space-time present; basically, it is a measure of the geometry of spacetime. Whenever we use \eta_{\mu\nu}, we are referring to what’s called the Minkowski metric: with X^{\mu} = (ct,x,y,z), it is just \eta_{\mu\nu} = diag(-1,1,1,1), which gives back the interval we wrote earlier. The Minkowski metric is what symbolizes flat space-time, or a space-time where there is no mass present to screw it up geometrically. For small masses, and for any location far away from a massive object, we assume spacetime is flat. The notation of the metric tensor, and the corresponding ‘matrix equations’, shall be explained in the class.

Now, what if we have a massive object? In such a case, we will need to define an appropriate metric, taking into account the symmetries of the problem involved. One such metric is known as the Schwarzschild metric, and is given by:

ds^2 = -(1-\frac{r_s }{r})c^2dt^2+(1-\frac{r_s }{r})^{-1}dr^2+r^2(d\theta^2+\sin^2 \theta d\phi^2)

Crazy huh? This is the metric which describes the space-time outside a spherically symmetric massive object, and it is most famous for predicting Black holes (there, it got fancier!). Basically, the metric as written blows up at r=r_s, a radius known as the Schwarzschild radius. This radius came to be known as the Event Horizon: nothing that falls inside it, not even light, can send information back out!
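To see how small r_s is for everyday objects, here is a Python sketch. It assumes the standard expression r_s = 2GM/c^2, which I have not written out above, and the masses and the solar radius are rounded reference values.

```python
G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2 (CODATA 2018)
C = 299_792_458.0      # speed of light in vacuum, m/s

def schwarzschild_radius(mass_kg):
    """r_s = 2 G M / c^2, in metres (standard formula, assumed here)."""
    return 2.0 * G * mass_kg / C**2

def time_factor(r, mass_kg):
    """The factor (1 - r_s/r) multiplying c^2 dt^2 in the Schwarzschild metric."""
    return 1.0 - schwarzschild_radius(mass_kg) / r

M_SUN = 1.98847e30     # kg (rounded)
M_EARTH = 5.9722e24    # kg (rounded)
R_SUN = 6.957e8        # m, nominal solar radius

rs_sun = schwarzschild_radius(M_SUN)
rs_earth = schwarzschild_radius(M_EARTH)

print(f"r_s of the Sun  : {rs_sun / 1e3:.2f} km")
print(f"r_s of the Earth: {rs_earth * 1e3:.2f} mm")
print(f"1 - r_s/r at the Sun's surface: {time_factor(R_SUN, M_SUN):.8f}")
```

The last number is extremely close to 1, which is why flat space-time is such a good approximation almost everywhere in the solar system.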

This is the metric which underlies some fancy observational effects in the universe, some of them being gravitational lensing, the photon sphere, and the precession of Mercury’s orbit. But before that, there is an important concept we must visit.

Geodesics: Geodesics are the paths taken by a ‘test mass’ in a given space-time. They can also be thought of as the ‘paths with shortest distance’ connecting two points in a given space. For example, the geodesic on a plane is a straight line, while on a sphere, it is the arc of a great circle (a circle whose center is at the center of the sphere) connecting the two points. Any free test particle, by default, moves along a geodesic.
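The sphere example is easy to check with numbers. The Python sketch below compares the great-circle arc between two points with the "obvious" route along a circle of constant latitude. The function names are mine, and the great-circle length uses the standard haversine formula.

```python
import math

def great_circle_distance(lat1, lon1, lat2, lon2, radius=1.0):
    """Length of the great-circle arc between two points (angles in degrees), via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * radius * math.asin(math.sqrt(h))

def parallel_distance(lat, lon1, lon2, radius=1.0):
    """Length of the path that stays on the circle of constant latitude (angles in degrees)."""
    return radius * math.cos(math.radians(lat)) * abs(math.radians(lon2 - lon1))

# Two points at latitude 60 degrees, 90 degrees apart in longitude, on a unit sphere.
geodesic = great_circle_distance(60.0, 0.0, 60.0, 90.0)
along_parallel = parallel_distance(60.0, 0.0, 90.0)

print(f"great-circle arc      : {geodesic:.4f}")
print(f"along the 60 deg line : {along_parallel:.4f}")
```

The great-circle route comes out shorter, which is why long-haul flights seem to bend towards the poles on a flat map. On the equator the two routes coincide, since the equator is itself a great circle.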

Now, “observers”, as the people doing the measurements are called in Relativity, are assumed to travel along geodesics. One can obtain the geodesics by extremizing the action (something which I might touch upon in the class), which leads to the geodesic equation:

\frac{d^2 x^\beta}{d\tau^2}+\Gamma^\beta _{\alpha \nu}\frac{dx^\alpha}{d\tau}\frac{dx^\nu}{d\tau} = 0, where

\tau represents proper time, and \Gamma is known as the Christoffel symbol(s).

Well, that was just to give you a flavor of the math in GR, and I don’t expect you to fully understand that yet. But take it at face value: solve for x to obtain the equation of the geodesic of the observer, for a given ‘metric’.

Now, let’s talk about some interesting implications of these equations, and the metric. First off, let’s rewind to the solar eclipse of 1919, when Arthur Eddington and others found that light from stars near the Sun was bent. The amount of ‘bending’ was found to be as predicted by GR, due to the distortion of spacetime by the Sun. This is known as the gravitational lensing effect.
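How big is this bending? Here is a small Python sketch for light grazing the edge of the Sun. It assumes the standard weak-field GR result \alpha = 4GM/(c^2 b), where b is the closest approach of the light ray; this formula is not derived in this post, and the function names are mine.

```python
import math

GM_SUN = 1.32712440018e20  # G times the solar mass, m^3/s^2
C = 299_792_458.0          # speed of light in vacuum, m/s
R_SUN = 6.957e8            # nominal solar radius, m

def deflection_angle(gm, impact_parameter):
    """Weak-field GR deflection of light, alpha = 4GM/(c^2 b), in radians (assumed formula)."""
    return 4.0 * gm / (C**2 * impact_parameter)

def to_arcsec(angle_rad):
    """Convert radians to arcseconds."""
    return math.degrees(angle_rad) * 3600.0

alpha_sun = to_arcsec(deflection_angle(GM_SUN, R_SUN))
print(f"Deflection of starlight grazing the Sun: {alpha_sun:.2f} arcsec")
```

The answer is under 2 arcseconds, a tiny angle, which is why it took a total eclipse and careful photography to measure it.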

Now, I suppose most of you would remember the ‘Black hole’ from the Nolan movie Interstellar:


Look at Fig:5. On the inner side of the ring, you can see a ring of light, right? That’s something known as a photon sphere (not really, but something similar). Incidentally, this is also a great example of lensing, as the top of the disk is actually at the “back”, but has been lensed to appear on the top.

Now, for the photon sphere, one solves the geodesic equation with the Schwarzschild metric. But that itself is not enough. One must also solve the ‘interval’ for ds^2 = 0, and for simplification, assume the photon goes in a circular orbit, making dr=0. Solving these two equations together, one can solve for the photon sphere, which is found to have a radius r_c = 1.5r_s, where r_s is the Schwarzschild radius.

The orbit at the photon sphere radius is actually unstable, if one looks at the effective potential. So technically, any small perturbation will knock a photon off the photon sphere!
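For those who want to see where the 1.5 and the instability come from, here is a sketch of the calculation in LaTeX. It is the standard effective-potential argument for light in the equatorial plane; the symbols E, L, \lambda and V_{\rm eff} are notation I am introducing here, not something defined elsewhere in this post.

```latex
% Light in the equatorial plane (\theta = \pi/2) of the Schwarzschild metric.
% Assumed notation: \lambda is a parameter along the path, and the conserved quantities are
%   E = (1 - r_s/r)\, c^2\, dt/d\lambda   and   L = r^2\, d\phi/d\lambda .
\begin{align}
ds^2 = 0 \;&\Rightarrow\; \left(\frac{dr}{d\lambda}\right)^2 = \frac{E^2}{c^2} - V_{\rm eff}(r),
\qquad V_{\rm eff}(r) = \left(1-\frac{r_s}{r}\right)\frac{L^2}{r^2} \\
% A circular orbit (dr = 0 for all time) sits at an extremum of V_eff:
\frac{dV_{\rm eff}}{dr} &= L^2\left(-\frac{2}{r^3} + \frac{3 r_s}{r^4}\right) = 0
\;\Rightarrow\; r_c = \frac{3}{2}\, r_s \\
% The extremum is a maximum, so the orbit is unstable:
\left.\frac{d^2V_{\rm eff}}{dr^2}\right|_{r_c} &= \frac{6L^2}{r_c^4}\left(1 - \frac{2 r_s}{r_c}\right)
= -\frac{2L^2}{r_c^4} < 0
\end{align}
```

The circular orbit sits on top of a hill in V_{\rm eff}, so the slightest nudge sends the photon rolling off to one side or the other.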

Why does all this happen?

Since space-time is curved by mass, the path of light is bent by the curved spacetime, like a lens. If it was just a bend, one would call it deviation from path. But, in the most extreme case, the bend is so much that the photon is trapped in a circular orbit, going around the source of mass, creating a photon sphere/ring or whatever you wanna call it!

Now, there is an interesting consequence of the gravitational potential in GR, in the Schwarzschild metric. And that is, it predicts the precession of the major axis of the orbit of a planet quite close to its star (please note it’s a subjective notion). In our solar system, Mercury is close enough to experience a measurable precession of its axis, and this was one of the earliest tests to support the validity of General relativity.
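To put a number on it, here is a Python sketch for Mercury. It assumes the standard GR result for the extra advance of the perihelion per orbit, \Delta\phi = 6\pi GM / (c^2 a (1-e^2)), which is not derived in this post; the orbital elements are rounded reference values and the function name is mine.

```python
import math

GM_SUN = 1.32712440018e20  # G times the solar mass, m^3/s^2
C = 299_792_458.0          # speed of light in vacuum, m/s

def precession_per_orbit(semi_major_axis, eccentricity, gm=GM_SUN):
    """GR perihelion advance per orbit, 6*pi*GM / (c^2 * a * (1 - e^2)), in radians (assumed formula)."""
    return 6.0 * math.pi * gm / (C**2 * semi_major_axis * (1.0 - eccentricity**2))

# Rounded orbital elements of Mercury (illustrative values).
A_MERCURY = 5.7909e10      # semi-major axis, m
E_MERCURY = 0.2056         # eccentricity
PERIOD_DAYS = 87.969       # orbital period, days

orbits_per_century = 36525.0 / PERIOD_DAYS
arcsec_per_rad = math.degrees(1.0) * 3600.0

arcsec_per_century = (
    precession_per_orbit(A_MERCURY, E_MERCURY) * arcsec_per_rad * orbits_per_century
)
print(f"GR precession of Mercury: {arcsec_per_century:.1f} arcsec per century")
```

This comes out at roughly 43 arcseconds per century, which is the famous leftover in Mercury's precession that Newtonian gravity could not account for. Try plugging in Earth's orbit to see why "close to its star" matters.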

There are many interesting consequences of GR, and the most fancy of them would be the ‘Wormhole’. Frankly speaking, I tried to understand the math behind it myself, but thoroughly got lost. So, I wouldn’t discuss it here, but we can have a fancy chat at the Summer school.

Oh yeah, almost forgot. Gravitational waves. Probably the coolest stuff in GR. These are an effect of  ‘tensor perturbations’ in the metric, which propagate outwards as waves. I shall discuss this in the lecture on Radiation.

Now, there are a LOT of concepts I haven’t even touched: the Friedmann-Robertson-Walker metric, Einstein’s equations, the Hubble parameter, Black holes, the de Sitter metric, and Penrose diagrams, to name a few. I shall point to resources which provide excellent introductions to these topics. But beware: you will need tensor calculus to be able to rigorously appreciate GR in its full glory!


  1. The Classical Theory of Fields, by Landau and Lifshitz.
  2. Theoretical Astrophysics, Vol. 1 and Vol. 3, by T. Padmanabhan.
  3. Geodesic derivation here.
  4. Lensing is shown here.
  5. Horrible calculation of Mercury’s perihelion precession here.
