Noise and the Shannon Limit
Overview
In any communication system, information is sent from a source node to a destination node in the form of a signal. The signal could take the form of a time varying voltage or current, a stream of light pulses, or a modulated radio or microwave carrier wave. As the signal propagates through the transmission medium it will become weaker due to attenuation (loss of signal power). The degree to which the signal is attenuated will depend upon the nature of the transmission medium and the distance over which the signal must travel. The signal may also undergo some kind of distortion (a change in the amplitude, frequency, or phase of the signal) due to imperfections in the transmission medium or signal processing hardware.
Both attenuation and distortion are characteristics that can be predicted for a given system. The effects of distortion can be mitigated by applying appropriate hardware and software engineering principles to the design of the communication system, and attenuated signals can be amplified or regenerated at the receiver before sampling takes place. The main factor that limits the performance of a communications system is noise. We can define noise as any form of electromagnetic interference that has the potential to alter a signal as it travels from a transmitter to a receiver, and within the receiver itself.
Although we can expect to encounter many different types of noise in a communication system, we can break it down into two general categories - external noise originating from outside the system, and internal noise that is inherent to that system. The most disruptive form of external noise is impulse noise - sudden bursts of electromagnetic energy such as those generated by vehicle ignition systems and heavy-duty electrical equipment.
The effect of impulse noise on a digital signal
Impulse noise tends to be of short duration and relatively high amplitude. Communication systems that are subject to a significant degree of impulse noise, such as industrial control systems, make use of shielded cables such as shielded twisted pair (STP) or coaxial cables, or fibre optic cables which are immune to electromagnetic interference. For systems carrying analogue information such as voice telephony, impulse noise is usually only a minor annoyance, but it is the primary cause of errors in digital communication.
The most common form of internal noise is thermal noise, which is caused by the thermal agitation of electrons in a conductor, and is a function of temperature. Thermal noise is present in all electronic devices and transmission lines and cannot be eliminated. It is in fact the main source of noise in electronic systems, and is also the main factor limiting the performance of a communication system. In the rest of this article, therefore, we will be concentrating on thermal noise and the limits it imposes on the data-carrying capacity of a communication channel.
Measuring signals and noise
Measuring the signal strength in a communication system enables us to get an idea of how the system is performing. We can measure the signal at two different points in the system, for example, to determine the degree to which the signal has been attenuated. We can also measure the noise in the system at various points, which will give us an idea of the difference between the strength of the signal and the level of the background noise. This is especially important for a receiver, where a sufficiently high level of noise can drown out a weak signal.
Signal strength is measured using the decibel (dB), a unit of measurement corresponding to one tenth of a bel (B). The bel was originally conceived as a power ratio unit that could be used to quantify signal loss in telegraph and telephone circuits. It is named after the Scottish scientist and engineer Alexander Graham Bell (1847-1922), who is credited with the invention of the telephone. A quantity expressed in bels is the decimal logarithm of a power ratio:
B = log_{10} (P_{1} / P_{2})
where P_{ 1} and P_{ 2} are power levels.
A logarithmic quantity expressed in decibels is called a level, and the difference between two levels can also be expressed in decibels. A difference between two power levels of one decibel represents a power ratio of 10^{1/10}. We can express a power ratio in decibels using the following formula:
Power ratio (dB) = 10 log_{10} (P_{1} / P_{2})
Although the decibel is often defined with respect to power, it can also be used to express the difference between two signal amplitude levels. This can mean either the difference between two voltage levels or two current levels, although the amplitude of an electrical signal is usually measured in volts - in most cases, the root mean square (rms) voltage. If you are familiar with basic electrical theory, you will recall that voltage is directly proportional to the square root of the signal's power:
P = V^{2} / R ⇒ V^{2} = PR ⇒ V = √(PR)
where P is power, V is voltage and R is resistance. A difference between two voltage levels of one decibel thus represents a voltage ratio of 10^{ 1/20}, and we can express a voltage ratio in decibels using the following formula:
Voltage ratio (dB) = 20 log_{10} (V_{1} / V_{2})
The way in which we express a ratio in decibels depends on whether it is a power ratio or a root-power ratio. A root-power ratio is the ratio of two root-power quantities such as voltage or current, the square of which (in linear systems) is proportional to power. A voltage ratio is thus a root-power ratio. When expressing a power ratio, therefore, the number of decibels is ten times its logarithm to base 10, but when expressing a root-power ratio, the number of decibels is twenty times its logarithm to base 10. In other words, the two decibel scales differ by a factor of two. In order to understand why this is so, we need to refer to the third law of logarithms, which states:
log_{ b} (m^{ n}) = n log_{ b} (m)
Or, in words, the log to base b of a number raised to a power n is equal to n times the log to base b of that number. From this, we can derive:
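log_{10} x^{2} = 2 log_{10} x
Since power is proportional to the square of the voltage, a power ratio P_{1} / P_{2} is equal to V_{1}^{2} / V_{2}^{2} (for the same resistance), so 10 log_{10} (V_{1}^{2} / V_{2}^{2}) = 20 log_{10} (V_{1} / V_{2}).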
Depending on the context in which they are used, decibels can express either a change in value or an absolute value. If the output of an amplifier is a signal that has twice the power of the input signal, the amplifier is said to have a gain of 3dB. Conversely, if the power level of a transmitted signal has fallen by fifty percent by the time it reaches the receiver, it is said to have been attenuated by 3dB. In the first case, the power ratio is 2/1, and we can calculate the gain as:
10 × log_{ 10} (2/1) = 3.01 ≈ 3 dB
In the second case, the power ratio is 1/2, and we can calculate the attenuation as:
10 × log_{ 10} (1/2) = -3.01 ≈ -3 dB
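As a quick check on the two decibel scales, here is a minimal Python sketch (the function names are our own, chosen purely for illustration) that converts power and voltage ratios to decibels and reproduces the 3 dB figures above. It also shows that a doubling of power, which corresponds to a voltage ratio of √2 across the same resistance, gives the same number of decibels on either scale.

```python
import math


def power_ratio_db(p1: float, p2: float) -> float:
    """Express the power ratio p1/p2 in decibels (10 x log10)."""
    return 10 * math.log10(p1 / p2)


def voltage_ratio_db(v1: float, v2: float) -> float:
    """Express the voltage (root-power) ratio v1/v2 in decibels (20 x log10)."""
    return 20 * math.log10(v1 / v2)


gain_db = power_ratio_db(2, 1)   # amplifier output has twice the input power
loss_db = power_ratio_db(1, 2)   # received power is half the transmitted power

# Doubling the power multiplies the voltage by sqrt(2) (same resistance assumed),
# so the voltage formula should give the same gain as the power formula.
same_gain_db = voltage_ratio_db(math.sqrt(2), 1)

print(f"gain: {gain_db:.2f} dB, loss: {loss_db:.2f} dB, via voltage: {same_gain_db:.2f} dB")
```

The factor of 20 in the voltage version is simply the factor of 10 multiplied by the exponent 2 that relates voltage to power.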
Decibels can also be used to express an absolute value. In this case, the number of decibels expresses the ratio of a value to some fixed reference value, and a suffix is usually added to the decibel symbol (dB) in order to indicate what that reference value is. In signal processing, for example, signal power is often specified with reference to a fixed reference value of 1 milliwatt, so the suffix is "m". For example, a signal power of 100 milliwatts could be expressed as:
10 × log_{ 10} (100/1) = 20 dBm
If signal amplitude is specified with reference to a fixed reference value of 1 volt, the suffix would be "V". For example, a signal amplitude of 100 millivolts could be expressed as:
20 × log_{ 10} (100/1000) = -20 dBV
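The same idea can be sketched in Python for absolute levels. The helper names below are hypothetical, and the reference values (1 milliwatt for dBm, 1 volt for dBV) are the ones given above. The sketch also includes the reverse conversion from dBm back to milliwatts, which is often needed in practice.

```python
import math


def to_dbm(power_mw: float) -> float:
    """Power level in dBm, i.e. decibels relative to 1 milliwatt."""
    return 10 * math.log10(power_mw / 1.0)


def to_dbv(voltage_v: float) -> float:
    """Voltage level in dBV, i.e. decibels relative to 1 volt."""
    return 20 * math.log10(voltage_v / 1.0)


def dbm_to_mw(level_dbm: float) -> float:
    """Convert a level in dBm back to a power in milliwatts."""
    return 10 ** (level_dbm / 10)


print(f"100 mW = {to_dbm(100):.1f} dBm")
print(f"100 mV = {to_dbv(0.1):.1f} dBV")
print(f"20 dBm = {dbm_to_mw(20):.1f} mW")
```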
Nyquist and Hartley
In 1927, the Swedish-American electronic engineer Harry Nyquist (1889-1976) determined that the maximum number of independent pulses that could be sent over a telegraph channel per unit time is equal to twice the analogue bandwidth of the channel. This idea can be expressed mathematically as
f_{ p} ≤ 2B
where f_{ p} is the maximum number of pulses per second and B is the bandwidth in hertz. The quantity 2B would later become known as the Nyquist rate*. Nyquist published his findings in his papers "Certain Factors Affecting Telegraph Speed" (1924) and "Certain Topics in Telegraph Transmission Theory" (1928). Nyquist's work suggests that the amount of information that can be transmitted over a given channel is proportional to the analogue bandwidth of that channel, regardless of the actual frequencies involved.
This concept, which would become an underlying principle of telecommunications, is embodied in Hartley's law, named after the American information theorist Ralph V. R. Hartley (1888-1970) who first proposed it while working at Bell Laboratories. In his 1928 paper "Transmission of Information", Hartley stated that "the total amount of information that can be transmitted is proportional to frequency range transmitted and the time of the transmission."
Hartley’s research was focused on quantifying the number of distinct voltage levels at which pulses could be transmitted over a given channel. He reasoned that this must depend at least in part on the ability of the receiver to distinguish between different voltage levels (i.e. the sensitivity of the receiver) and the dynamic range of the received signal (in this context, the range of voltages that the receiver can actually detect). Hartley asserted that each distinct voltage level can carry a different message. If the dynamic range of the transmitted signal is restricted to ±A volts, and the sensitivity of the receiver is ±ΔV volts, then the maximum number of unique messages M is given by:
M = 1 + A / ΔV
As you probably realise, Hartley’s "messages" equate to signal symbols. The number of bits we can encode per symbol depends on the number of different symbols we can transmit, which in turn depends on the number of different signal levels we can generate. For a one-bit message, we need two signal symbols; for a two-bit message, we need four signal symbols; for three bits we need eight symbols, and so on. Hartley’s maximum line rate (or data signalling rate) R can thus be expressed as:
R = f_{ p} log_{ 2} (M)
where f_{ p} is the Nyquist rate (or pulse rate or symbol rate or baud).
As we saw above, Nyquist had already established that the number of pulses it was possible to transmit on a channel is twice the analogue bandwidth in hertz. Hartley was thus able to combine his formula with that of Nyquist to obtain a new formula, giving the maximum data signalling rate based on channel bandwidth:
R ≤ 2B log_{ 2} (M)
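To make the relationship concrete, the following Python sketch evaluates Hartley's two formulas, M = 1 + A/ΔV and R = 2B log_{2} (M). The numbers used (a 3 kHz channel, a signal range of ±7.5 volts and a receiver sensitivity of ±0.5 volts) are invented for illustration only, and the function names are our own.

```python
import math


def hartley_levels(amplitude_v: float, sensitivity_v: float) -> float:
    """Number of distinguishable levels M = 1 + A / delta-V."""
    return 1 + amplitude_v / sensitivity_v


def hartley_rate(bandwidth_hz: float, levels: float) -> float:
    """Hartley's maximum data signalling rate R = 2B log2(M), in bits per second."""
    return 2 * bandwidth_hz * math.log2(levels)


# Hypothetical example values, not taken from any real system.
levels = hartley_levels(amplitude_v=7.5, sensitivity_v=0.5)
bits_per_symbol = math.log2(levels)
rate = hartley_rate(bandwidth_hz=3000, levels=levels)

print(f"M = {levels:.0f} levels, {bits_per_symbol:.0f} bits per symbol, R = {rate:.0f} bps")
```

Doubling the number of levels adds only one bit per symbol, whereas doubling the bandwidth doubles the rate.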
Hartley’s formula indicates that the data signalling rate achievable for a given channel will be proportional to the channel’s analogue bandwidth, although it doesn’t tell us what data signalling rates we can realistically expect to achieve when we have to take transmission impairments into account, including, of course, noise. Nevertheless, the work of Nyquist and Hartley formed the basis for a more complete theory of information and transmission, as we shall see.
* The term Nyquist rate is used differently in the context of signal processing, where it specifies the minimum rate at which a signal must be sampled at the receiver in order to recover all of the information it carries. Its value is twice the highest frequency (called the Nyquist frequency) of the signal to be sampled.
Thermal noise
As we saw above, thermal noise (also variously called Johnson noise, Nyquist noise, or Johnson-Nyquist noise) is caused by the thermal agitation of charge carriers in a conductor, and is thus a function of temperature. The charge carriers in question are usually, though not always, electrons. Thermal noise was first discovered in 1926 by the Swedish-born American electrical engineer John B. Johnson (1887-1970) whilst working for Bell Labs. Johnson described the phenomenon to his colleague Harry Nyquist (see above) who came up with an explanation for his results. Johnson published details of his discovery in July 1928, in a paper entitled "Thermal Agitation of Electricity in Conductors".
In Nyquist’s similarly titled paper "Thermal Agitation of Electric Charge in Conductors", which was published in the same month as Johnson’s paper, he describes how the electromotive force generated by the thermal agitation of electrons in a conductor can be determined by applying principles of thermodynamics and statistical mechanics, and states that thermal noise is purely a function of temperature, resistance and frequency. He makes reference to the equipartition law, which relates the temperature of a system to its average energies, as it applies to harmonic oscillators. The law states (in this context) that:
⟨H⟩ = k_{ B}T
where ⟨H⟩ is the average (potential + kinetic) energy of the oscillator, k_{ B} is Boltzmann’s constant (1.38064852 × 10^{ -23} J/K), and T is temperature. The "harmonic oscillators" in a circuit are the charge carriers (electrons), and ⟨H⟩ represents the noise power density in watts per hertz (W/Hz). Thermal noise can be considered to be white noise since it has the same power density throughout the frequency spectrum (up to around 80 gigahertz, depending on temperature), so multiplying the above equation by bandwidth B gives us the noise power N:
N = k_{ B}TB
One unavoidable feature of communication systems is that signals are attenuated as they propagate from a transmitter to a receiver. This is a particular problem in wireless systems, because signal power density decreases as the distance between the transmitter and the receiver increases, in accordance with the inverse-square law; the received signal power is inversely proportional to the square of the distance. Even so, if it wasn't for the fact that the receiver itself generates internal noise, even extremely weak signals could be detected.
We saw above that thermal noise is a function of temperature, resistance and frequency, and that, because the power density is essentially uniform across the usable frequency range, the noise power will be directly proportional to bandwidth. In a receiver, thermal noise is generated by resistors and other resistive components inside the receiver. The level of this internally generated noise is the receiver's noise floor. In order for the receiver to be able to differentiate between the internally generated noise and a received signal, the signal level must exceed the noise floor by a sufficient margin. It is important, therefore, to be able to calculate the level of thermal noise generated by the receiver.
Thermal noise calculations
Let's start by thinking about how we can determine the thermal noise in a circuit with a known resistance. We can model the circuit as a Thevenin equivalent circuit consisting of a voltage source, representing the thermal noise generated by a noisy resistor, in series with an equivalent ideal (noise-free) resistor.
A Thevenin equivalent circuit can be used to model the thermal noise generated by a receiver
The voltage source produces an open circuit rms noise voltage V across its "internal" resistance in series with the ideal resistor, both of which have resistance R. The resulting closed-circuit rms current I would therefore be given as:
I = V / (2R)
The noise power N would thus be calculated as:
N = I^{2}R = (V^{2} / (2R)^{2}) R = V^{2} / (4R)
Therefore:
k_{B}TB = V^{2} / (4R)
We can now rearrange the equation to get:
V^{ 2} = 4k_{ B}TRB
The rms noise voltage V_{ n} can thus be written as:
V_{n} = √(4k_{B}TRB)
As we can see from the above, the thermal noise voltage is dependent on temperature, resistance and bandwidth. The thermal noise power, on the other hand, is only dependent on temperature and bandwidth. In either case, we can reduce the level of the thermal noise by reducing temperature or bandwidth (or both). In terms of reducing temperature, this is not really practical in most circumstances. Reducing bandwidth is possible, and relatively easy to achieve, through the use of suitable filters.
Let's calculate the values for noise power (in dBm) and rms noise voltage (in microvolts) for a typical scenario. We'll assume a temperature of 293.15 K (i.e. 20° Celsius, which is within the range typically considered to be room temperature), a resistance of 50 ohms, and a bandwidth of 1 megahertz. Here are the calculations:
N = 10 × log_{10} (k_{B}TB / 1 mW)
k_{B}TB = 1.38064852 × 10^{-23} J/K × 293.15 K × 10^{6} Hz = 4.04737 × 10^{-15} W = 4.04737 × 10^{-12} mW
N = 10 × log_{10} (4.04737 × 10^{-12}) = -113.928 dBm
V_{n} = √(4k_{B}TRB)
V_{n} = √(4 × 1.38064852 × 10^{-23} J/K × 293.15 K × 50 Ω × 10^{6} Hz)
V_{n} = 0.899708 μV
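For readers who want to try other values, the calculation can be sketched in a few lines of Python. The function names are our own, and the value of Boltzmann's constant is the one quoted earlier in this article. The temperature is taken in degrees Celsius and converted to kelvin before use.

```python
import math

BOLTZMANN = 1.38064852e-23  # J/K, the value quoted in this article


def thermal_noise_dbm(temp_c: float, bandwidth_hz: float) -> float:
    """Thermal noise power N = kTB, expressed in dBm (relative to 1 mW)."""
    noise_w = BOLTZMANN * (temp_c + 273.15) * bandwidth_hz
    return 10 * math.log10(noise_w / 1e-3)


def thermal_noise_vrms(temp_c: float, resistance_ohm: float, bandwidth_hz: float) -> float:
    """Open-circuit rms noise voltage V_n = sqrt(4kTRB), in volts."""
    temp_k = temp_c + 273.15
    return math.sqrt(4 * BOLTZMANN * temp_k * resistance_ohm * bandwidth_hz)


# The scenario worked through above: 20 degrees Celsius, 50 ohms, 1 MHz.
n_dbm = thermal_noise_dbm(20, 1e6)
v_n = thermal_noise_vrms(20, 50, 1e6)

print(f"N = {n_dbm:.3f} dBm")
print(f"V_n = {v_n * 1e6:.6f} uV rms")
```

Note that the resistance appears only in the voltage calculation; the noise power depends on temperature and bandwidth alone.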
Thermal noise distribution
Although the way in which we have modelled thermal noise seems to imply that we are dealing with an AC voltage, the reality is that it is totally random in nature, and its waveform is impossible to predict. This means that we can't eliminate or reduce the effects of such noise using techniques such as cancellation. That said, for a finite bandwidth, the thermal noise will have an amplitude distribution that is (approximately) Gaussian.
Putting this another way, the noise voltage will vary according to a Gaussian probability density function (PDF). In theory, a PDF takes the set of possible values for a random variable, and for each of those values outputs the probability of that value occurring at any given time. However, since the variable could take any one of an infinite number of possible values, it is probably more accurate to say that the function outputs the probability of the random variable falling within a given range of values.
A Gaussian PDF for thermal noise voltage produces a normal distribution curve, i.e. a bell-shaped curve as shown in the illustration below. The curve represents the statistical probability of the noise voltage falling within a particular range of values at any given time. The highest point of the curve lies at the centre, and indicates the value with the highest probability of occurring, which for a normal distribution will also be the mean (or average) value. In the case of our thermal noise voltage, this will be zero.
Thermal noise voltage varies according to a Gaussian probability density function
Note that, in a normal distribution, the mean value is also the median (the value that falls in the middle of the range of values), and the mode (the value that appears most often in the range). As we can see from the curve, the probability of the noise voltage taking a positive or a negative value is the same. We can also see that the probability falls off symmetrically on either side of the centre as the magnitude of the voltage increases, and appears to flatten out as it approaches the bottom of the curve (although in theory, it never reaches a probability of zero on either side).
A standard normal distribution has a mean of zero, and a standard deviation of one. The mean identifies the position of the centre of the curve, and the standard deviation determines its height and width. The standard deviation is the square root of the variance of whatever variable we are looking at, and is represented by the Greek lower-case letter σ (sigma). In this case, the standard deviation is the rms noise voltage. Within a standard normal distribution, 68% of values fall within ± one standard deviation from the mean, 95% fall within ± two standard deviations (2σ), and 99.73% fall within ± three standard deviations (3σ).
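The percentages quoted above can be reproduced with the error function, since the probability of a normally distributed value falling within ±k standard deviations of the mean is erf(k/√2). The short Python sketch below (with a function name of our own choosing) uses only the standard library.

```python
import math


def probability_within(k: float) -> float:
    """Probability that a Gaussian variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {probability_within(k) * 100:.2f}%")
```

For thermal noise, where σ is the rms noise voltage, this means that the instantaneous noise voltage exceeds three times the rms value (in either direction) only about 0.27% of the time.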
The signal to noise ratio
The signal-to-noise ratio (SNR) of a signal is, as the name suggests, a measure used to compare the level of a signal to the level of background noise; it is the ratio of signal power to noise power. Communications engineers attempt to maximise the SNR by optimising the design of transmitter and receiver hardware to achieve maximum performance. Typical measures include boosting transmitter output and using filters at the receiver to eliminate out-of-band noise. The SNR is usually expressed in decibels:
SNR_{dB} = 10 log_{10} (P_{signal} / P_{noise}) = 10 log_{10} P_{signal} - 10 log_{10} P_{noise}
where P_{ signal} is the average signal power and P_{ noise} is the average noise power.
An SNR greater than 0 dB means that signal power is greater than noise power. In order to accurately determine an SNR, both the signal and the noise should be measured at the same point in a system, or at equivalent points in the system having the same bandwidth. The signal and the noise should also be measured in the same way. For example, if the noise voltage is measured, it should be taken over the same impedance. If we use voltage to determine the SNR rather than power, we use a different definition:
SNR_{dB} = 10 log_{10} (V_{signal}^{2} / V_{noise}^{2})
SNR_{dB} = 20 log_{10} (V_{signal} / V_{noise}) = 20 log_{10} V_{signal} - 20 log_{10} V_{noise}
where V_{ signal} is the average signal rms voltage and V_{ noise} is the average noise rms voltage.
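The two definitions give the same answer provided that both voltages are measured across the same impedance. The Python sketch below illustrates this with invented figures (a 1 millivolt rms signal and 10 microvolts rms of noise across 50 ohms); the function names are hypothetical.

```python
import math


def snr_db_from_power(p_signal: float, p_noise: float) -> float:
    """SNR in decibels from average signal and noise power (same units)."""
    return 10 * math.log10(p_signal / p_noise)


def snr_db_from_voltage(v_signal: float, v_noise: float) -> float:
    """SNR in decibels from rms signal and noise voltage (same impedance)."""
    return 20 * math.log10(v_signal / v_noise)


# Hypothetical measurements taken at the same point in a system.
resistance = 50.0    # ohms
v_signal = 1e-3      # volts rms
v_noise = 10e-6      # volts rms

snr_v = snr_db_from_voltage(v_signal, v_noise)
snr_p = snr_db_from_power(v_signal**2 / resistance, v_noise**2 / resistance)

print(f"from voltage: {snr_v:.1f} dB, from power: {snr_p:.1f} dB")
```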
The SNR of a communication system gives us an indication of how well a signal carrying useful information stands out against the system's inherent background noise. The higher the ratio, the easier it is for the receiver to detect and correctly interpret the incoming signal. A good SNR puts the signal well above the noise floor. If the incoming signal is weak, as tends to be the case with most forms of wireless communication, simply amplifying the signal will not help because the underlying noise will also be amplified.
In general, a signal cannot be detected if the signal strength is at or below the noise floor. The notable exceptions are signals generated using spread-spectrum techniques, which transmit the signals across a broad range of frequencies according to a pseudo-random coded sequence. Spread spectrum signalling techniques were originally developed for military communications systems in order to render the signals less vulnerable to interception or jamming.
The spreading process reduces the spectral density of the signal to make it look like noise. The actual signal levels can be below the noise floor, but because the background noise is essentially white noise, they can be detected by a receiver that knows the spreading sequence, and thus knows which frequency to listen to at any given moment. To anybody else listening in, the signal just looks like small random fluctuations in the background noise.
For other kinds of signal, if the level is too close to the noise floor some of the information carried by the signal will be lost. Each frame or packet that cannot be read by the receiver because it has been too badly corrupted by noise results in a request for retransmission. The average number of requests for retransmission per unit time is known as the retry rate. A high retry rate equates to low data throughput.
As a general rule, regardless of the type of network we are talking about, a good SNR is essential in order to achieve acceptable data rates. For wireless networks, the general consensus seems to be that an SNR of around 20 dB is adequate for applications such as email, data transfer and surfing the web. For more demanding applications such as video conferencing or watching streaming videos, an SNR of at least 25 dB, and preferably higher, is recommended.
The Shannon limit
When we talk about channel capacity, we are usually referring to the maximum amount of information that can be reliably transmitted over a communication channel per unit time. As we saw above, Harry Nyquist established the Nyquist rate (the maximum number of signal symbols that could be sent over a channel) as being equal to twice the bandwidth of the channel in hertz. Ralph Hartley took things a step further, deriving a maximum data rate R based on the Nyquist rate 2B and the number of unique signal symbols M that could be generated:
R ≤ 2B log_{ 2} (M)
As we pointed out earlier, however, the formula does not take transmission impairments such as noise into account. This was left to the American mathematician and electrical engineer Claude Shannon (1916-2001), who is probably best remembered for establishing information theory as a field of study in its own right following the publication of his landmark paper "A Mathematical Theory of Communication" in 1948, which was in part based on the work of Nyquist and Hartley.
Among other things, Shannon's paper outlines what is usually referred to as the noisy-channel coding theorem. This states that, for a noisy communication channel with channel capacity C, over which information is transmitted at a rate R, then providing R < C, the use of suitable error-correction codes can make the probability of error at the receiver arbitrarily small.
The channel capacity C, sometimes called the Shannon limit, is determined according to the Shannon-Hartley theorem, and is the maximum rate at which information can be transmitted over a channel of a given bandwidth B in the presence of noise, assuming that the degree to which the channel is affected by noise is known. For discrete (digital) data, the channel capacity C represents the maximum achievable bit-rate (in bits-per-second), and is calculated as follows:
C = B log_{2} (1 + S/N)
A detailed analysis of how Shannon arrived at this result is beyond the scope of this article. Suffice it to say that his approach was rigorous and highly mathematical. Those interested can find the full derivation in Shannon's 1948 paper.
It may occur to the reader at this point that the above formula looks somewhat similar to Hartley's formula defining the maximum data rate R of a noiseless channel. In fact, we can re-write the formula as follows:
C = 2B log_{2} ((1 + S/N)^{1/2})
From the above, it should be apparent to the reader that there is an implied relationship between the number of unique symbols M in Hartley's formula and the signal-to-noise ratio in Shannon's formula:
M = √(1 + S/N)
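We can confirm numerically that the two formulas agree when M takes this value. The following Python sketch uses an invented channel (1 MHz of bandwidth and a linear signal-to-noise ratio of 255, chosen so that the numbers come out round); the function names are our own.

```python
import math


def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Channel capacity C = B log2(1 + S/N), with S/N as a plain ratio (not in dB)."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def effective_levels(snr_linear: float) -> float:
    """Number of levels M = sqrt(1 + S/N) implied by comparing Shannon with Hartley."""
    return math.sqrt(1 + snr_linear)


def hartley_rate(bandwidth_hz: float, levels: float) -> float:
    """Hartley's rate R = 2B log2(M)."""
    return 2 * bandwidth_hz * math.log2(levels)


# Hypothetical channel: 1 MHz bandwidth, S/N = 255 (about 24 dB).
bandwidth = 1e6
snr = 255

capacity = shannon_capacity(bandwidth, snr)
levels = effective_levels(snr)
rate = hartley_rate(bandwidth, levels)

print(f"C = {capacity:.0f} bps, M = {levels:.0f}, R = {rate:.0f} bps")
```

Note that M here is an effective number of levels that can be distinguished reliably in the presence of noise, rather than the number of levels a transmitter happens to generate.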
The implication here is that, since the bandwidth of a channel is fixed, the number of unique signals it is possible to transmit reliably will depend in no small measure on the signal-to-noise ratio. Let's investigate a real-world example. Consider a modern dial-up modem (this sounds like a contradiction in terms in the age of high-speed Internet access, but there are still a handful of ISPs that offer dial-up Internet access). The most advanced dial-up modems have a theoretical maximum downstream data transfer rate of 56 kbps (56,000 bits per second). How is this figure arrived at?
The bandwidth of an analogue telephone line is 4000 hertz (4 kHz). Applying the Nyquist rate would thus give us a baud rate of 8000 baud. The most advanced dial-up modems are capable of transmitting 7 bits per baud, which requires a value for M of 128 (in other words, we need 128 unique signal symbols):
M = 2^{ 7} = 128
According to Hartley's formula, the theoretical maximum data rate R (in bits per second) would be calculated as follows:
R = 2B log_{ 2} M = 2 × 4000 × log_{ 2} 128 = 8000 × 7 = 56,000
This theoretical maximum is never achieved in practice because analogue telephone lines, like pretty much every other type of communication channel, are subject to noise. Just how close we can get to the theoretical maximum will depend on how much noise there is in relation to the strength of the information-carrying signal. In other words, it depends on the signal-to-noise ratio. A typical analogue telephone line has a signal-to-noise ratio of 30 dB, so let's, as they say, "do the math" - this time according to Shannon.
Bear in mind that, in Shannon's formula, the signal component S and the noise component N of the signal-to-noise ratio (S/N) are expressed using either the average power in watts or the square of the average rms voltage. Because the signal-to-noise ratio is usually expressed in decibels, we need to perform the necessary conversion to get the value of S/N. In this case, we have:
S/N = 10^{30/10} = 10^{3} = 1000
The channel capacity, C, of our typical analogue phone line can now be calculated as follows:
C = B log_{2} (1 + S/N) = 4000 × log_{2} 1001 = 39,869 bps
Clearly, even with a signal-to-noise ratio of 1000:1 we are going to be some way short of the theoretical maximum data transfer rate of 56 kbps. What signal-to-noise ratio would give us this value? Setting C to 56,000 bps in Shannon's formula and solving for S/N gives:
log_{2} (1 + S/N) = 56,000 / 4,000 = 14
S/N = 2^{14} - 1 = 16,383
In order to achieve the maximum theoretical data transfer rate of 56 kbps, therefore, we would need to have a signal-to-noise ratio of 16,383:1. In decibels, this would be:
10 log_{10} 16,383 = 42.144 dB
The reality is that, even over relatively short subscriber loops, the SNR of an analogue telephone line is unlikely to get anywhere near this figure. An extra 3 dB would almost double the value of the SNR (33 dB ≈ 1,995), but it would not double the data transfer rate. At best, it would give us just under 4,000 bits per second more.
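The figures in this example are easy to check for ourselves. The Python sketch below (function names are our own) computes the capacity of a 4 kHz channel at 30 dB and at 33 dB, and solves C = B log_{2} (1 + S/N) for the signal-to-noise ratio needed to reach 56,000 bits per second.

```python
import math


def db_to_ratio(level_db: float) -> float:
    """Convert a power ratio in decibels to a plain ratio."""
    return 10 ** (level_db / 10)


def capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon capacity C = B log2(1 + S/N), with the SNR supplied in decibels."""
    return bandwidth_hz * math.log2(1 + db_to_ratio(snr_db))


def required_snr_db(bandwidth_hz: float, rate_bps: float) -> float:
    """SNR in decibels needed for the capacity to equal rate_bps."""
    ratio = 2 ** (rate_bps / bandwidth_hz) - 1
    return 10 * math.log10(ratio)


BANDWIDTH = 4000  # hertz, analogue telephone line

c_30 = capacity_bps(BANDWIDTH, 30)
c_33 = capacity_bps(BANDWIDTH, 33)
needed = required_snr_db(BANDWIDTH, 56_000)

print(f"capacity at 30 dB: {c_30:.0f} bps")
print(f"capacity at 33 dB: {c_33:.0f} bps (gain of {c_33 - c_30:.0f} bps)")
print(f"SNR needed for 56 kbps: {needed:.3f} dB")
```

Because capacity grows with the logarithm of the signal-to-noise ratio, each additional 3 dB adds a roughly constant increment to the capacity rather than doubling it.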
Bear in mind also that Shannon's limit is a theoretical maximum. All channels are subject to some degree of noise, which means that we cannot guarantee that the transmitted data will be error free. Transmitting data at a rate below the channel capacity, using suitable coding, can reduce the probability of error to an arbitrarily low value, certainly, but errors will still occur.
In order to be able to correct the errors that occur, we must encode the transmitted data with error detection codes that enable the receiver to detect any errors and request re-transmission of the affected data, or forward error correction (FEC) codes that enable the receiver to both detect and correct any errors without having to request re-transmission.
Both error detection and forward error correction codes add a significant amount of overhead to the transmitted data and reduce the net data rate achievable. That said, the availability of increased processing power in recent years has facilitated the development of highly efficient codes, which in turn make it possible to achieve the transfer of useful data at rates that are very close to the Shannon limit.