
'Intel' Pentium 4 (Northwood version), one example out of a huge number of x86 implementations from Intel, AMD, and others.

'AMD' Athlon (early version), another, technically different, but fully compatible x86 implementation.
The generic term ''x86'' refers to the "CISC" type instruction set (architecture) of the most commercially successful CPUs in the history of personal computing, used in processors from Intel, AMD, VIA, and others. (The instruction set is distinct from the "microarchitecture", which refers to a CPU's internal layout.) The term derives from the model numbers of the first few generations of CPUs that were backward binary compatible with Intel's original 16-bit 8086 of 1978, most of which ended in "86" [1].
After the introduction of the 80386 in 1985, the ''x86'' term also came to imply, in practice, binary compatibility with the 80386's extended 32-bit instruction set, sometimes emphasized as 'x86-32' to distinguish it either from the original 16-bit x86-16 or from the newer 64-bit 'x86-64' instruction set. The ''x86'' term usually implies the 32-bit ''x86-32'' instruction set, while the ''x86-64'' term (used especially in reference to 64-bit processors [2]) is often replaced by the 'x64' name (used exclusively for 64-bit software [3]), at least in personal computing and servers [4].
The only significant competitors to x86 in PCs were the Motorola 68k (CISC type) and the PowerPC (RISC type) instruction sets. However, by August 7, 2006, Apple Inc. had switched to x86 CPUs, granting the x86 instruction set an effective monopoly among desktop and notebook processors. The x86 also held a growing majority among servers and workstations. Markets without a significant x86 presence include low-cost embedded processors found in appliances and toys, among others. [5]
A vast amount of computer software is written for the ''x86'' platform, including nearly all modern commercial operating systems from MS-DOS and Microsoft Windows to Linux, BSD, Solaris OS, and Mac OS X, making the ''x86 instruction set architecture'' indispensable on a global scale and practically irreplaceable.
Chronology
The table below lists brands of well-known [6] x86 (instruction set) consumer-targeted processors, grouped by generation. Note: ''The definition of a CPU generation is not strict. Each generation is roughly marked by significantly improved and commercially successful processor microarchitecture designs.''
| Generation | Introduction | Prominent CPU brands | Addressing | Notable features |
|---|---|---|---|---|
| 1 (IA-16) | 1978 | Intel 8086, Intel 8088, Intel 80186, NEC V20 | 16-bit | First x86 microprocessors |
| 2 | 1982 | Intel 80286 | 16-bit | built-in MMU |
| 3 (IA-32) | 1985 | Intel386, AMD Am386 | 32-bit | IA-32 instruction set, MMU with paging |
| 4 | 1989 | Intel486 | 32-bit | Instruction pipeline, integrated FPU, integrated cache |
| 5 | 1993 | Pentium, AMD K5, AMD K6 | 32-bit | Superscalar, 64-bit bus, MMX |
| 6 | 1995 | Pentium Pro, Pentium II, AMD K6-2, Cyrix 6x86, Pentium III | 32-bit | RISC core, L2 cache, superpipelining, SSE |
| 6-M | 2003 | Pentium M | 32-bit | low power |
| 7 (IA-32, X86-64) | 1999 | Athlon, Athlon XP, Pentium 4, Pentium D | 32-bit, 64-bit | SSE2, SSE3, Hyper-Threading |
| 7-M | 2006 | Intel Core | 32-bit | dual-core |
| 8 (X86-64) | 2003 | Athlon 64, Intel Core 2, AMD K10 | 64-bit | x86-64 instruction set, multi-core |
History
The x86 architecture first appeared in the Intel 8086 CPU released in 1978, as a fully 16-bit design based on the earlier instruction set of the Intel 8085. Although not binary compatible with the ''8085'', the ''8086'' was designed to allow assembly language programs written for the ''8085'' to be mechanically translated into equivalent ''8086'' assembly. This made the ''8086'' a tempting migration path for ''8085'' hardware and software vendors but, due to the 16-bit data bus, not without significant redesign of the ''8085'' system hardware. To reduce the need for such a redesign, Intel introduced the 8088, whose external 8-bit data bus interfaced more easily to already established, and therefore low-cost, 8-bit system and peripheral chips. This, together with other, non-technical factors, encouraged IBM to build its IBM PC around the ''8088'', despite the presence (at the time) of technically superior competitors such as the Motorola 68000. Subsequently, the ''IBM PC'' became the dominant personal computer platform, and the ''8088'' (''8086'') and its successors became the dominant CPUs for desktop and laptop computers, making their instruction set architecture (later named ''x86'') dominant as well.
At various times, companies such as IBM, NEC, AMD, TI, STM, Fujitsu, OKI, Siemens, Cyrix, Intersil, C&T, NexGen, and UMC started to design and/or manufacture processors that implemented the x86 ''instruction set'' architecture (but in varying ''CPU'' hardware designs, called "microarchitectures", and so-called "compatible" with the original) and were intended for personal computers as well as embedded systems. For the personal computer market, real quantities started to appear around 1990 with 386 and 486 ''compatible'' processors, often named similarly to Intel's original chips. Other companies that designed or manufactured x86 or x87 processors include ITT Corporation, National Semiconductor, ULSI Systems, and Weitek.
Following the fully pipelined i486, Intel introduced the Pentium brand name (which could be trademarked, unlike numbers) for their new line of superscalar ''x86'' designs. With the 80x86 naming scheme now legally cleared, IBM partnered with Cyrix to produce the 5x86 and then the very efficient 6x86 (M1) and 6x86MX (MII) lines of Cyrix designs, which were the first ''x86'' (''instruction set architecture'') chips implementing register renaming to enable speculative execution. AMD meanwhile designed and manufactured the advanced but delayed 5k86 (K5), heavily based on their earlier 29K RISC type (hardware) ''microarchitecture''. Like NexGen's Nx586, it used a strategy where dedicated pipeline stages decode ''x86'' instructions into uniform and easily handled micro-operations, a method that has remained standard to this day.
Some early versions of these competitors' chips had heat dissipation problems. The 6x86 was also affected by a few minor compatibility issues, and the Nx586 lacked an FPU as well as (the then crucial) pin-compatibility, while the K5 had somewhat disappointing performance when it was (eventually) launched. Low customer awareness of alternatives to the Pentium line further contributed to these designs being comparatively unsuccessful, despite the fact that the K5 had very good Pentium compatibility and the 6x86 was significantly faster than the Pentium on integer code. [7] On the other hand, AMD later established itself as a serious contender with the K6 line of processors, which gave way to the highly successful Athlon and Opteron. There were also other contenders, such as Centaur Technology (IDT), Rise Technology, and Transmeta. VIA Technologies' energy-efficient C3 and C7 processors were designed by Centaur and are in full production today.
The architecture has twice been extended to a larger word size. In 1985, Intel released the 32-bit 386 to gradually replace the earlier 16-bit chips (which were sold for many more years). This extension to the architecture is sometimes called x86-32 to differentiate it from the original "x86-16" or the newer x86-64 extension. However, it was originally referred to as i386 by Intel (and others) and later renamed IA-32 (for Intel Architecture, 32-bit) when Intel unveiled its unrelated 64-bit Itanium architecture, referred to as IA-64. In 1999-2003, AMD further extended the architecture to 64 bits, originally calling the result x86-64 in AMD documents and now AMD64. Intel soon adopted AMD's architectural extensions under the name IA-32e, which was later renamed EM64T and finally Intel 64 (not to be confused with the unrelated IA-64 architecture). Microsoft and Sun Microsystems have used their own vendor-neutral name, x64, for this same x86-64 architecture.
Design
Technical overview
The x86 architecture is a variable instruction length, primarily two-address, "CISC" design with emphasis on backward compatibility. The instruction set is not typical CISC, however, but basically an extended and orthogonalized version of the simple eight-bit 8085 architecture. Words are stored in little-endian order, and 16-bit and 32-bit accesses are allowed to unaligned memory addresses. To conserve opcode space, most register addresses are three bits, and at most one operand can be in memory (in contrast with some highly orthogonal CISC designs such as the PDP-11, where both operands can be in memory), but this memory operand may also be the ''destination'', while the other operand, the ''source'', can be either ''register'' or ''immediate''. This contributes, among other factors, to a code footprint that rivals 8-bit machines and enables efficient use of instruction cache memory. During execution, current x86 processors employ a few extra decoding steps to split most instructions into smaller pieces, micro-ops, which are readily executed by a micro-architecture that could be (simplistically) described as a RISC machine without the usual load/store limitations. The small number of general registers (also inherited from the 8085) has made register-relative addressing (using small immediate offsets) an important method of accessing operands, especially on the stack. Much work has therefore been invested in making such accesses as fast as register accesses, i.e. a one-cycle instruction throughput in most circumstances.
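The little-endian byte order and unaligned access mentioned above can be illustrated with a minimal Python sketch. The helper names `store_le32` and `load_le16` are hypothetical and exist only for this example; a byte string stands in for memory.

```python
import struct

def store_le32(value):
    """Return the 4 bytes of a 32-bit value in little-endian order,
    i.e. least significant byte at the lowest address."""
    return struct.pack("<I", value & 0xFFFFFFFF)

def load_le16(memory, offset):
    """Read a 16-bit little-endian word from any byte offset,
    including odd (unaligned) ones."""
    return struct.unpack_from("<H", memory, offset)[0]

memory = store_le32(0x12345678)
print(list(memory))                 # [120, 86, 52, 18], i.e. 78 56 34 12 in hex
print(hex(load_le16(memory, 1)))    # 0x3456, read from an odd offset
```

The second read starts at offset 1, which is not word-aligned; on x86 such an access is legal, and here it is simply modelled by slicing the buffer.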
Segmentation
Minicomputers during the late 1970s were running up against the 16-bit 64 KB address limit, as memory had become cheaper. Most such companies therefore redesigned their processors to directly handle 32-bit addressing and data. The original 8086, developed from the simple 8085 microprocessor and primarily aiming at another market, instead adopted a much-criticized concept of segment registers which raised the memory address limit by only 4 bits, to 20 bits (1 megabyte).
Data and/or code could be managed within "near" 16-bit segments within this 1 MB address space, or a compiler could operate in a "far" mode using 32-bit segment:offset pairs reaching (only) 1 MB. While that would also prove to be quite limiting by the mid-1980s, it was working for the emerging PC market, and made it very simple to translate software from the older 8080, 8085, and Z80 to the newer processor. Seven years later, in 1985, this cumbersome addressing model was effectively factored out by the introduction of 32-bit offset registers in the 386 design.
The original 8086 and 8088
The original Intel 8086 and 8088 have fourteen 16-bit registers. Four of them (AX, BX, CX, DX) are general registers (although each has an additional purpose; for example, only CX can be used as a counter with the ''loop'' instruction). Each can be accessed as two separate bytes (thus BX's high byte can be accessed as BH and its low byte as BL). Four segment registers (CS, DS, SS and ES) are used to form a memory address. There are two pointer registers: SP points to the top of the stack (the most recently pushed word, at the lowest stack address in use), and BP is used to point at some other place in the stack or in memory (as an offset). Two registers (SI and DI) are for array indexing. The FLAGS register contains flags such as the carry flag, overflow flag and zero flag. Finally, the instruction pointer (IP) points to the next instruction to be executed.
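The byte-level view of the general registers can be sketched in a few lines of Python. The function names `split_register` and `join_register` are hypothetical and used only for illustration.

```python
def split_register(value16):
    """Split a 16-bit register value into (high, low) bytes,
    e.g. BX -> (BH, BL)."""
    value16 &= 0xFFFF
    return (value16 >> 8) & 0xFF, value16 & 0xFF

def join_register(high, low):
    """Rebuild the 16-bit register value from its two byte halves."""
    return ((high & 0xFF) << 8) | (low & 0xFF)

bh, bl = split_register(0x12AB)
print(hex(bh), hex(bl))              # 0x12 0xab
print(hex(join_register(bh, bl)))    # 0x12ab
```

Writing to BH or BL therefore changes only one half of BX, which is what `join_register` models when one of its arguments is replaced.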
The 8086 has 64 KB of 8-bit (or alternatively 32 K-word of 16-bit) I/O space, and a 64 KB (one segment) stack in memory supported by hardware. Only words (2 bytes) can be pushed to the stack. The stack grows downwards (toward numerically lower addresses), with SS:SP pointing to the most recently pushed word, i.e. the lowest stack address in use. There are 256 interrupts, which can be invoked by both hardware and software. The interrupts can cascade, using the stack to store the return address.
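The downward-growing, word-only stack can be modelled with a small Python sketch. The class `Stack8086` is hypothetical; it assumes the 8086 address formation of segment × 16 + offset, keeps memory in a dictionary, and wraps SP within its 64 KB segment.

```python
class Stack8086:
    """Toy model of the 8086 hardware stack addressed through SS:SP."""

    def __init__(self, ss, sp):
        self.ss = ss & 0xFFFF
        self.sp = sp & 0xFFFF
        self.memory = {}  # physical address -> byte

    def _physical(self, offset):
        # segment * 16 + offset, limited to the 20 address bits of the 8086
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, word):
        # SP is decremented first, then the word is stored little-endian
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self._physical(self.sp)] = word & 0xFF
        self.memory[self._physical(self.sp + 1)] = (word >> 8) & 0xFF

    def pop(self):
        # The word is read first, then SP is incremented
        low = self.memory[self._physical(self.sp)]
        high = self.memory[self._physical(self.sp + 1)]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low

stack = Stack8086(ss=0x2000, sp=0x0100)
stack.push(0x1234)
stack.push(0xBEEF)
print(hex(stack.sp))      # 0xfc, two words below the starting SP
print(hex(stack.pop()))   # 0xbeef, last in, first out
```

Each push lowers SP by two bytes, so the most recently pushed word always sits at the lowest stack address in use.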
Real mode
Main articles: Real mode
Real mode is an operating mode of 80286 and later x86-compatible CPUs. Real mode is characterized by a 20-bit segmented memory address space (meaning that only 1 MB of memory can be addressed), direct software access to BIOS routines and peripheral hardware, and no concept of memory protection or multitasking at the hardware level. All x86 CPUs in the 80286 series and later start up in real mode at power-on; 80186 CPUs and earlier had only one operational mode, which is equivalent to real mode in later chips.
In real mode, memory access is ''segmented''. This is done by shifting the segment address left by 4 bits and adding an offset to obtain a final 20-bit address. For example, if DS is A000h and SI is 5677h, DS:SI will point at the absolute address DS × 16 + SI = A5677h. Thus the total address space in real mode is 2^20 bytes, or 1 MB, quite an impressive figure for 1978. All memory addresses consist of both a segment and an offset; every type of access (code, data, or stack) has a default segment register associated with it (for data the register is usually DS, for code it is CS, and for stack it is SS). For data accesses, the segment register can be explicitly specified (using a segment override prefix) to use any of the four segment registers.
In this scheme, two different segment/offset pairs can point at a single absolute location. Thus, if DS is A111h and SI is 4567h, DS:SI will point at the same A5677h as above. This scheme makes it impossible to use more than four segments at once. CS and SS are vital for the correct functioning of the program, so that only DS and ES can be used to point to data segments outside the program (or, more precisely, outside the currently-executing segment of the program) or the stack. This scheme was intended as a compatibility measure with the
Intel 8085.
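The address calculation and the aliasing of segment:offset pairs described above can be checked with a short Python sketch. The function name `linear_address` is hypothetical, and the final mask assumes the 20 address lines of the original 8086, where addresses above 1 MB wrap around.

```python
def linear_address(segment, offset):
    """Real-mode address formation: segment * 16 + offset,
    truncated to 20 bits as on the original 8086 (assumption)."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF

# The two examples from the text resolve to the same absolute address.
print(hex(linear_address(0xA000, 0x5677)))   # 0xa5677
print(hex(linear_address(0xA111, 0x4567)))   # 0xa5677
```

Because the segment is only shifted by 4 bits, many different pairs map to each physical address, which is why comparing far pointers requires normalizing them first.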
The segmented nature can make programming and compiler design difficult because the use of near and far pointers affects performance. The introduction of bank switching schemes such as EEMS made programming even more complicated before the adoption of 32-bit addressing with later processors.
16-bit protected mode
Main articles: Protected mode
In addition to real mode, the Intel 80286 supports protected mode, expanding addressable physical memory to 16 MB and addressable virtual memory to 1 GB. This is done by using the segment registers only for storing an index into a segment table. There are two such tables, the Global Descriptor Table (GDT) and the Local Descriptor Table (LDT), each holding up to 8192 segment descriptors, with each segment giving access to up to 64 KB of memory. The segment table provides a 24-bit base address, which is added to the desired offset to create an absolute address. Each segment can be assigned one of four ring levels used for hardware-based computer security.
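As a rough illustration, the following Python sketch decodes a protected-mode selector and reproduces the 1 GB figure. The bit layout used (13-bit index, one table-indicator bit, two requested-privilege bits) is the standard selector format rather than something stated in the text above, and the function names are hypothetical.

```python
def decode_selector(selector):
    """Split a 16-bit selector into table index, table and privilege level.
    Assumed layout: bits 15..3 index, bit 2 table indicator, bits 1..0 RPL."""
    return {
        "index": (selector >> 3) & 0x1FFF,            # 0..8191
        "table": "LDT" if selector & 0x4 else "GDT",
        "rpl": selector & 0x3,                        # ring 0..3
    }

def virtual_space_bytes(tables=2, descriptors=8192, segment_size=64 * 1024):
    """Total virtual memory reachable through both descriptor tables."""
    return tables * descriptors * segment_size

print(decode_selector(0x001B))        # {'index': 3, 'table': 'GDT', 'rpl': 3}
print(virtual_space_bytes() == 2**30) # True: 2 x 8192 x 64 KB = 1 GB
```

The 13-bit index is what limits each table to 8192 descriptors, and two tables of 8192 segments of 64 KB each give the 1 GB of virtual memory quoted above.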
Because real mode DOS programs may do direct hardware access or perform segment arithmetic, both incompatible with protected mode, an operating system (OS) is limited in its ability to run these applications as processes. To overcome these difficulties, Intel introduced the 80386 with virtual 8086 mode. While still subject to paging, it uses real mode addressing to form linear addresses and allows the OS to trap both I/O and memory access. By design, protected mode programs do not assume a relation between selector values and physical addresses.
Operating systems like OS/2 1.x try to switch the processor between protected and real modes. This is both slow and unsafe, because a real mode program can easily crash a computer. OS/2 1.x defines restrictive programming rules allowing a ''Family API'' or ''bound'' program to run in either real or protected mode.
Windows 3.0 could run real mode programs in 16-bit protected mode. When transitioning to protected mode, Windows 3.0 preserved the single privilege level model that was used in real mode, which is why Windows applications and DLLs can hook interrupts and do direct hardware access. That lasted through the Windows 9x series. If a Windows 1.x or 2.x program is written properly and avoids segment arithmetic, it will run the same way in both real and protected modes. Windows programs generally avoid segment arithmetic because Windows implements a software virtual memory scheme, moving program code and data in memory when programs are not running, so manipulating absolute addresses is dangerous; programs should only keep handles to memory blocks when not running. Starting an old program while Windows 3.0 is running in protected mode triggers a warning dialog, suggesting either to run Windows in real mode or to obtain an updated version of the application. Updating well-behaved programs using the MARK utility with the MEMORY parameter avoids this dialog. It is not possible to have some GUI programs running in 16-bit protected mode and other GUI programs running in real mode. In Windows 3.1, real mode disappeared.
32-bit protected mode
The Intel 80386 introduced a significant advance in the x86 architecture: an all 32-bit design supporting paging. All of the registers, instructions, I/O space and memory are 32-bit. Memory is accessed through a 32-bit extension of protected mode. As in the 286, segment registers are used to index a segment table describing the division of memory. With a 32-bit offset, every application may access up to 4 GB (or more with memory segments). In addition, 32-bit protected mode supports paging, a mechanism making it possible to use virtual memory. An exception to this design is the Intel 80386SX, which is 32-bit with 24-bit addressing and a 16-bit data bus.
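To make the paging mechanism more concrete, here is a minimal Python sketch of how a 32-bit linear address is divided when 4 KB pages are used. The 10/10/12-bit split is the classic 386 two-level layout and is an assumption not spelled out in the text above; the function name is hypothetical.

```python
def split_linear_address(addr):
    """Split a 32-bit linear address into
    (page directory index, page table index, offset within the page),
    assuming 4 KB pages and a two-level 10/10/12-bit layout."""
    addr &= 0xFFFFFFFF
    directory_index = (addr >> 22) & 0x3FF   # top 10 bits
    table_index = (addr >> 12) & 0x3FF       # middle 10 bits
    offset = addr & 0xFFF                    # low 12 bits
    return directory_index, table_index, offset

print(split_linear_address(0x00403025))   # (1, 3, 37)
```

The two indices select entries in tables maintained by the operating system, which is what lets it place any 4 KB page anywhere in physical memory or mark it as not present.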
No new general-purpose registers were added. All 16-bit registers except the segment registers were expanded to 32 bits. This is represented by prefixing an "E" (for "Extended") to the register names (thus the expanded AX became EAX, SI became ESI and so on). With a greater number of registers, instructions and operands, the machine code format was expanded. To provide backward compatibility, segments with executable code can be marked as containing either 16-bit or 32-bit instructions. Special prefixes allow inclusion of 32-bit instructions in a 16-bit segment or vice versa.
Paging and segmented memory access are required for modern multitasking operating systems. Linux, 386BSD and Windows NT were developed for the 386 because it was the first Intel architecture CPU to support paging and 32-bit segment offsets. The 386 architecture became the basis of all further development in the x86 series. The success of Windows 3.1, the first widely accepted version of Microsoft Windows, was largely due to its ability to take advantage of 386 features, even though it was used mainly to run multiple sessions rather than to take advantage of the native 32-bit instruction set.
The Intel 80387 math co-processor was integrated into the next CPU in the series, the Intel 80486 (the 486SX, sold as a budget processor, had its co-processor disabled or removed). The new floating point unit (FPU) performed floating point calculations, important for scientific applications and graphic design.
MMX and beyond
Main articles: MMX
MMX is a SIMD instruction set designed by Intel, introduced in 1997 for Pentium MMX microprocessors. It developed out of a similar unit first used on the Intel i860. It is supported on most subsequent IA-32 processors by Intel and other vendors. MMX is typically used for video applications.
MMX added 8 new 64-bit registers to the architecture, known as MM0 through MM7 (generically MMn). In reality, these new registers are aliases for the existing x87 FPU stack registers. Hence, anything done to the floating point stack also affects the MMX registers. Unlike the floating point stack, these MMn registers are randomly accessible.
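The SIMD idea behind MMX, one instruction operating on several small values packed into a 64-bit register, can be emulated in plain Python. The sketch below mimics the wraparound byte addition performed by the MMX instruction PADDB; the function `paddb` is an illustrative model, not how the hardware is programmed.

```python
def paddb(a, b):
    """Add two 64-bit values as eight independent 8-bit lanes,
    wrapping around on overflow within each lane (as PADDB does)."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        lane_sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        result |= (lane_sum & 0xFF) << shift   # discard the carry out of the lane
    return result

mm0 = 0x01020304050607FF
mm1 = 0x0101010101010101
print(hex(paddb(mm0, mm1)))   # 0x203040506070800
```

Note that the lowest lane wraps from FFh to 00h without carrying into its neighbour, which is the key difference from an ordinary 64-bit addition.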
3DNow!
Main articles: 3DNow!
In 1998, AMD introduced 3DNow!, which consisted of SIMD floating point instruction enhancements to MMX. The introduction of this technology coincided with the rise of 3D entertainment applications and was designed to improve the CPU's vector processing performance in graphics-intensive applications. 3D video game developers and 3D graphics hardware vendors use 3DNow! to enhance their performance on AMD's K6 and Athlon series of processors.
SSE
Main articles: Streaming SIMD Extensions, SSE2, SSE3
In 1999, Intel introduced the Streaming SIMD Extensions (SSE) instruction set, which added eight new 128-bit registers (not overlaid with other registers) and 70 floating point instructions.
In 2000 Intel introduced the SSE2 instruction set, adding a complete complement of integer instructions (analogous to MMX) to the original SSE registers and 64-bit SIMD floating point instructions to the original SSE registers. The first addition made MMX almost obsolete, and the second allowed the instructions to be realistically targeted by conventional compilers.
Introduced in 2004 along with the ''Prescott'' revision of the Pentium 4 processor, SSE3 added specific memory and thread-handling instructions to boost the performance of Intel's Hyper-Threading technology. AMD licensed the SSE3 instruction set and implemented most of the SSE3 instructions for its revision E and later Athlon 64 processors. The Athlon 64 does not support Hyper-Threading and lacks those SSE3 instructions used only for Hyper-Threading.
64-bit Long mode
Main articles: x86-64
By 2002, it was obvious that the 32-bit address space of the x86 architecture was limiting its performance in applications requiring large data sets. A 32-bit address space allows the processor to directly address only 4 GB of data, a size surpassed by applications such as video processing and database engines. With 64-bit addresses, one can directly address 16777216 TiB (more than 16 billion GB) of data, although most 64-bit architectures do not support access to the full 64-bit address space (AMD64, for example, supports only 48 bits of a 64-bit address, split into 4 paging levels).
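The figures above can be verified with a few lines of Python, which also sketch how a 48-bit AMD64 virtual address might be divided among the four paging levels. The 9/9/9/9/12-bit split shown assumes 4 KB pages and is not stated in the text; the function names are hypothetical.

```python
def address_space_tib(bits):
    """Size of a flat address space with the given number of address bits, in TiB."""
    return 2**bits // 2**40

def split_virtual_address(addr):
    """Split the low 48 bits of an address into four 9-bit table indices
    (top level first) and a 12-bit page offset, assuming 4 KB pages."""
    addr &= (1 << 48) - 1
    indices = tuple((addr >> shift) & 0x1FF for shift in (39, 30, 21, 12))
    return indices, addr & 0xFFF

print(address_space_tib(64))   # 16777216
print(address_space_tib(48))   # 256

example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_virtual_address(example))   # ((1, 2, 3, 4), 5)
```

Four levels of 9 bits plus a 12-bit offset account for exactly 48 bits, which corresponds to 256 TiB of virtual address space.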
AMD, who would traditionally follow the lead of Intel, took the initiative of extending the 32-bit x86 architecture to 64-bit, initially calling it ''x86-64'', later renaming it ''AMD64''. The Opteron, Athlon 64, Turion 64, and later Sempron families of processors use this architecture. The success of the AMD64 line of processors coupled with the lukewarm reception of the IA-64 architecture prompted Intel to reverse-engineer and adopt the instruction set, adding new extensions of its own and branding it the ''EM64T'' architecture, and later re-branding it ''Intel 64''.
In their literature and product version names, Microsoft and Sun refer to AMD64/Intel 64 collectively as ''x64'' in the Windows and Solaris operating systems respectively. Linux distributions refer to it either as "x86-64", its variant "x86_64", or "amd64". BSD systems use "amd64" while Mac OS X uses "x86_64".
This was the first time that a ''major'' upgrade of the x86 architecture was initiated and originated by a manufacturer other than Intel. It was also the first time that Intel accepted technology of this nature from an outside source.
Virtualization
x86 virtualization is difficult because the architecture did not meet the Popek and Goldberg requirements until recently. Nevertheless, there are several commercial x86 virtualization products, such as VMware, Parallels and Microsoft Virtual PC, as well as open source virtualization projects such as Bochs and QEMU. Other solutions, such as the Kernel-based Virtual Machine ("KVM"), require newer processors which provide better hardware support for virtualization.
Intel and AMD have introduced x86 processors with hardware-based virtualization extensions that overcome the classical virtualization limitations of the x86 architecture. These extensions are known as Intel VT (IVT or simply VT), code-named "Vanderpool", and AMD-V, code-named "Pacifica". Although most modern x86 server processors and many modern x86 desktop processors include these extensions, the technology is generally considered immature at this point, with most software-based virtualization outperforming these extensions. [8] This is expected to change as the technology matures.
See also
★ IA-32
★ x86 assembly language
★ x86 instruction listings
★ x87
★ Real mode, Unreal mode, Virtual 8086 mode, Protected mode, Long mode
★ x86-64
★ IA64
★ Microarchitecture
★ List of Intel microprocessors
★ List of AMD microprocessors
★ List of VIA microprocessors
★ List of x86 manufacturers
Footnotes
1. With the introduction of the Pentium brand in 1993, Intel ended its "80x86" naming scheme as ''numbers'' could not be trademarked. However, the term x86 was already firmly established among technicians, compiler writers etc.
2. Linux Kernel Compiling
3. Intel Web page search result for "x64"
4. Intel's equivalents of the x86 and x86-64 have been the IA-32 and Intel 64 (EM64T or IA-32e) respectively. Likewise, AMD prefers the AMD64 name over the x86-64 they introduced themselves.
5. The embedded processor's market is populated by more than 20 different architectures, which, due to the price sensitivity, low power and hardware simplicity requirements, outnumber the x86.
6. Microprocessor Hall of Fame
7. It had a slower Floating point unit however, which is slightly ironic as Cyrix started out as a designer of fast Floating point units for x86 processors.
8. A Comparison of Software and Hardware Techniques for x86 Virtualization
External links
★ 25 Years of Intel Architecture
★ x86 cpus' guide