Memory Management
In order for programs to be executed by the processor, they must be loaded into main memory, which is several orders of magnitude faster than secondary storage in terms of its access time. Historically, main memory has always been significantly more expensive than secondary storage, and system designers have tried to optimise its use. Although costs have fallen steadily, this is still the case today, and modern operating systems and applications require large amounts of memory in order to run.
The memory management component of an operating system is concerned with the organisation and management of system memory. It determines how memory is allocated to processes, responds to constantly changing demands, and interacts with memory management hardware (if present) to maximise efficiency.
Overlays
One of the main limitations imposed on programmers in the early days of computing was the size of the computer's memory. If the program was larger than the available memory, it could not be loaded, which placed severe restrictions on program size. The obvious solution would be to increase the amount of memory available, but this would significantly increase the cost of the computer system.
One way for a programmer to overcome these limitations was to use overlays. The programmer divided the program into a number of logical sections. A small portion of the program had to remain in memory at all times, but the remaining sections (or overlays) were loaded only when they were needed. The use of overlays allowed programmers to write programs that were much larger than physical memory, although responsibility for managing memory usage rested with the programmer rather than the operating system.
A program is written using overlays to overcome memory limitations
Partitions and protected memory
Another issue that must be dealt with in terms of memory management is that of memory protection. In a single-user, single-tasking system, this is relatively easy, since it is simply a matter of protecting the operating system's memory space from the user's program. Memory is effectively split into two partitions – one partition is reserved for the operating system, while the other is available for user programs.
Protection can be implemented using a boundary register built into the processor. The boundary register contains the base memory address of the user program, effectively defining the lower bounds of the memory space available to user processes. Each time a user process references a memory address, the system checks the referenced address against the value stored in the boundary register. If the user process has attempted to reference a memory location with a lower address, the process is terminated with an appropriate error message.
Operating system memory is protected
In a multi-programming environment, matters are inevitably going to be far more complicated. In early multi-programming systems, the memory management function had to protect the operating system memory space from all user processes loaded into memory as before, but it now also had to protect those user processes from each other.
These early multi-programming systems used fixed partitions to allocate memory to user processes, each of which could hold a single process. Each partition was defined by two boundary registers – a base register and a limit register for the for the lower and upper memory addresses respectively. Although more complex than systems in which only a single user process was allowed to run, memory management was still relatively simple because each process ran within a pre-defined partition.
A multi-programming system with fixed partitions
Internal fragmentation
The main problem with the fixed-partition system was that each partition could accommodate only a single process. If the process did not use all of the memory within a partition, the remaining memory could not be used for any other process. The loss of available memory within a partition due to the fact that the process allocated to the partition may be significantly smaller than the partition in which it executes is known as internal fragmentation. Often, there was enough unused system memory to run another process, but there were no available partitions. This meant that the unused memory could not be utilised, and was plainly an inefficient use of memory resources.
Processes are queued awaiting an available partition
Variable partitions
The main problem with fixed partitions is that they are rarely if ever exactly the right size for a particular process. If the partition is too small, a process cannot be loaded into it and must await the availability of a large partition. If the partition is too large, the process may be loaded but significant amounts of the available memory within the partition will be wasted.
The idea of a variable-size partition allows the operating system to create partitions of exactly the right size for a particular process wherever there is sufficient space available. Internal fragmentation is thus eliminated, but a new problem emerges. Processes are initially added to the system using contiguous memory space. As each process completes, however, it is removed from the system leaving a hole in memory. This hole can be used for a new process, providing it will fit into the space made available. Since it highly unlikely that any new processes will fit exactly into the gaps left by the old processes, the only processes able to use these gaps in memory will be smaller than their predecessors.
Processes in variable partitions can leave memory holes
The operating system must also determine where to place incoming programs when there may be a number of holes large enough to accommodate the newly created process. There are three memory placement strategies that have been commonly used for this purpose:
- first-fit – the process is placed in first available hole large enough to hold it, allowing the operating system to make a decision quickly
- best-fit – the process is placed in the hole that fits it most closely, leaving little space unused but creating additional management overhead (searching for the best fit) and leaving small holes that cannot be used by other processes
- worst-fit – the process is placed in the largest possible hole regardless of size, potentially leaving enough space for another program but again creating additional management overhead (searching for the worst fit) and potentially leaving smaller holes that cannot be used by other processes
Over time and regardless of the memory placement strategy used, a number of small holes will be created in memory that cannot be utilised by new processes because they are simply too small. This is referred to as external fragmentation. In order to counter this, operating systems evolved the ability to relocate partitions so that the holes were effectively moved to the top of memory and merged into one much larger unused memory space. This procedure is sometimes called compaction, and while it enables memory to be used far more efficiently, it increases yet again the complexity of the memory management function. The additional system overhead required to implement a compaction scheme is considerable.
Memory compaction
In early single-tasking operating systems, only one process could reside in memory at any one time. This did not allow for particularly efficient use of the processor, since each time execution of the resident process was halted awaiting the completion of some I/O operation such as reading from or writing to disk, the processor was effectively idle.
As the amount of available memory increased it became possible to load multiple processes into memory simultaneously, enabling the processor to be quickly switched between them to provide the illusion that a number of processes were executing at the same time. It also enabled the processor to be immediately allocated to a waiting process in the event of the current process being halted to await the completion of an I/O request.
In multitasking systems, numerous processes now compete for the available memory, and one or more of the resident processes may have to give up memory to accommodate an incoming process with a higher priority. A process that relinquishes its memory before it has completed in order to make way for another process is temporarily stored on disk (in what is often referred to as virtual memory) until system memory is once more available (or until it achieves a sufficiently high priority level to enable it to replace another process currently stored in main memory).
The movement of processes back and forth between system memory and secondary storage in this way is called "swapping", and may occur several times during the execution of a single process. Inevitably, swapping incurs additional operating system overhead and adds complexity to the business of managing memory. It should also be fairly self-evident that the degree to which swapping must occur can be reduced by increasing the amount of memory available.
Paging and segmentation
The need to load an entire program into memory in one contiguous block places severe restrictions on memory usage. The use of overlays overcame the limitations imposed by the physical size of memory, but in order to make efficient use of the processor, several processes needed to be in memory at one time.
As physical memory became larger this was certainly possible, but the memory space required by each process could vary widely, and for the operating system, the main problem was trying to find the most suitable memory space for a particular process. Inevitably, some of the memory could not be used, and in many cases a process would be too big to fit into any of the available spaces.
One answer was to break each process down into a number of relatively small fixed-size blocks, called pages, which could be loaded into memory as and when required. The memory itself was also organised into blocks of the same size, called page-frames. Each page could be loaded into any available page frame, which meant that the entire available memory space could be used for user processes, and allowed far more processes to be in memory at the same time. This was a far more efficient use of computer memory, but also imposed a much greater burden on the operating system in terms of memory management overhead, since it must now keep track of multiple pages for each user process.
One of the benefits of a paging system is that only a relatively small subset of the pages belonging to a particular process need to be loaded at any one time. In fact theoretically, only the page that is currently executing needs to reside in memory. When a process references a page that is not in main memory, it generates a page fault, which prompts the operating system to load the missing page into memory from secondary storage.
The frequency with which this occurs will depend on the size of the page, and the total number of pages which make up a process. In practice, there will be a minimum number of pages that should reside in memory at any given time for a particular process. This number is known as the minimum working set, and will be determined to a large extent by the overall size of the user program.
Another method of utilising memory, called segmentation, is also based on the notion of splitting a process into blocks, but the size of the blocks (or segments) is variable. A scheme that allows the use of variable size segments can be useful from a programmer's point of view, since it lends itself to the creation of modular programs, but the operating system now not only has to keep track of the starting address of each segment, but since they are variable in size, must also calculate the offset to the end of each segment. Some systems combine paging and segmentation by implementing segments as variable-size blocks composed of fixed-size pages.
Virtual memory
The concept of virtual memory is closely associated with that of paging. As we have mentioned previously, virtual memory effectively extends the amount of memory available to applications by using some of the system's secondary storage space (i.e. the hard disk drive) as working memory. All or part of a process may be swapped between real memory and this virtual memory. This both allows more processes to co-exist on the system and eliminates the limitations on overall process size.
The only factor that limits process size, in fact, is the amount of secondary storage that is made available for virtual memory. In a sense, virtual memory systems create the illusion that there is more working memory available than is actually the case.
The use of a paging scheme, in conjunction with the use of virtual memory, leads to a situation where the individual pages of a process may be located in a number of different locations in both real memory and virtual memory. In fact, during the course of process execution, an individual page may be swapped between real and virtual memory a number of times, and is unlikely to occupy the same address twice.
This presents the operating system with a seemingly monumental task in terms of keeping track of the whereabouts of each page, for each process on the system. Furthermore, the process of swapping pages in and out of memory creates additional system overhead. As previously mentioned, it is generally accepted that the more random access memory (RAM) a system has the better it will perform, since less time will be spent swapping pages between real and virtual memory.
In a system that employs virtual memory, the size of the pages can have an effect on system efficiency. If the pages are small, the operating system will require more tracking information for each process because there will be more pages. If, on the other hand, the size of the pages is large, it will take longer to transfer each page between real memory and virtual memory, and vice versa, when swapping occurs.
Memory mapping
In a paged virtual memory system, there are actually two different kinds of address. A program instruction within a process, for example, will have an internal (or virtual) address that will be used to reference it from within the process. A virtual address will consist of a page number and an offset from the start of the page.
When the process is loaded into memory, however, the same program instruction will also have a physical address in memory. When a virtual address is referenced by a process, therefore, the operating system must translate this virtual address to a physical address. This is further complicated by the fact that the various pages of a process may be loaded into non-contiguous page frames, and may be swapped in and out of memory several times during the lifetime of the process.
A virtual address consists of the page number and offset
The operating system creates a page table in memory for each process. The table contains a page table entry for each page of the process, and the entries are ordered sequentially by page number. When a process is assigned to the processor, the address of the page table in memory is loaded into a special purpose register called the page map origin register.
Each time the process references a virtual address, the system interrogates this register to obtain the base address of the page table, and adds the page number to give it the address of the page table entry. The entry will hold the starting address in memory of the page. Adding the offset will provide the required physical address.
The diagram below illustrates how a virtual address referenced by a process is translated into a physical address in memory. The mapping is performed dynamically by a high-speed, special-purpose hardware unit called the memory management unit (MMU).
Virtual address translation (memory mapping)
Note that only some of the pages belonging to a particular process will be loaded into memory at any given time. The page table entry for each page must therefore indicate whether or not the page currently resides in memory. This is achieved using a 1-bit field in the entry called the resident bit. If the resident bit is set to 1, the entry gives the page frame number. If it is set to 0, the entry provides the location of the page in secondary storage.
Page replacement policies
When a new page needs to be brought into memory, it will often be necessary to remove one that is currently resident. The algorithms used to determine which pages will be replaced are called page replacement policies. Some of the policies that may be used include:
- Least recently used (LRU) – this policy replaces whichever page has remained unreferenced for the greatest period of time. For this to work, each page needs a time stamp that is updated whenever the page is referenced, and the system has to search through all of the timestamp entries to find the oldest value each time it needs to replace a page. An alternative is to maintain a linked list of pages, in which a referenced page is moved to the front of the list, although this is equally time consuming.
- Not Recently Used (NRU) – this policy works almost as well as LRU but involves far less overhead. Each page frame has a page referenced bit associated with it, which is set to 1 if the page is referenced (the operating system periodically resets all of the page referenced bits to zero). Any page with a page referenced bit of zero is eligible for replacement.
- First-In-First-Out (FIFO) – this policy simply replaces the page which has been in memory for the longest time, on the basis that the page is unlikely to still be in use. A chronologically ordered linked list of pages is used to determine which page has been in memory the longest. The page at the front of the list is removed, and the new page is added to the end of the list. This method may result in poor performance, however, since some of the pages removed may be in frequent use, and their removal from memory will result in additional page faults being generated.
- Clock – this policy is a variation of FIFO with the difference that a circular linked list is used, and each entry in the list has a page referenced bit that is initially set to 0, and is set to 1 each time the page is referenced. A pointer indicates the current head of the list, and when a page replacement is required the pointer moves around the list until a page frame with a page referenced bit of 0 is found (entries with a page referenced bit of 1 have it set to 0). If all page referenced bits are set to 1, the pointer will return to its starting point (which will have a page referenced bit of 0, since the pointer has already been here once). This is similar to FIFO, except that a recently used page will not be replaced.