PostgreSQL is an object-relational database management system (ORDBMS). It is released under a BSD-style license. As with many other open-source programs, PostgreSQL is not controlled by any single company, but relies on a global community of developers and companies to develop it.
PostgreSQL's unusual-looking name makes some readers pause when trying to pronounce it, especially those who pronounce SQL as "sequel". PostgreSQL's developers pronounce it "Post-Gres-Q-L". It is also common to hear it abbreviated as simply "postgres", which was its original name. The name refers to the project's origins as a "post-Ingres" database, the original authors having also developed the Ingres database.
History
PostgreSQL has had a lengthy evolution, starting with the Ingres project at UC Berkeley. The project leader, Michael Stonebraker, had left Berkeley to commercialize Ingres in 1982, but eventually returned to academia. After returning to Berkeley in 1985, Stonebraker started a post-Ingres project to address the problems with contemporary database systems that had become increasingly clear during the early 1980s. While they share many of the same ideas, the code bases of PostgreSQL and Ingres started (and remain) completely separate.
The resulting project, named POSTGRES, aimed to introduce the minimum number of features needed to add complete support for types. These features included the ability to define types, but also the ability to fully describe relationships, something that had been widely used before this time but maintained entirely by the user. In POSTGRES the database "understood" relationships, and could retrieve information in related tables in a natural way using rules.
Starting in 1986 the team released a number of papers describing the basis of the system, and by 1988 the project had a prototype version up and running. The team released version 1 to a small number of users in June 1989, followed by version 2 with a rewritten rules system in June 1990. Version 3, released in 1991, rewrote the rules system again, and also added support for multiple storage managers and an improved query engine. By 1993 the large number of users had begun to overwhelm the project with requests for support and features. After releasing version 4, primarily as a cleanup, the project ended.
Although the POSTGRES project had officially ended, the BSD license (under which Berkeley had released POSTGRES) enabled open-source developers to obtain copies and to develop the system further. In 1994 two UC Berkeley graduate students, Andrew Yu and Jolly Chen, added a SQL language interpreter to replace the earlier Ingres-based QUEL system, creating Postgres95. The code was subsequently released to the web to find its own way in the world.
In July 1996, Marc Fournier at Hub.Org Networking Services provided the first non-university development server for the open-source development effort. Together with Bruce Momjian and Vadim B. Mikheev, he began work to stabilize the code inherited from UC Berkeley, and the first open-source version was released on August 1, 1996.
The project was renamed in 1996: to reflect the database's new SQL query language, Postgres95 became PostgreSQL. The first PostgreSQL release was version 6.0, in January 1997. Since then, a group of database developers and volunteers from around the world, coordinating via the Internet, has maintained the software.
Although the license allowed for the commercialization of Postgres, the Postgres code did not develop commercially with the same rapidity as Ingres, which is somewhat surprising considering the advantages Postgres offered. The main offshoot originated when Paula Hawthorn (an original Ingres team member who moved from Ingres) and Michael Stonebraker formed Illustra Information Technologies to commercialize Postgres.
In 2000, former Red Hat investors put together a company known as Great Bridge to commercialize PostgreSQL and compete against commercial database vendors. Great Bridge sponsored several PostgreSQL developers and donated many resources back to the community,[1] but by late 2001 the company closed its doors, citing tough competition from companies like Red Hat as well as poor market conditions.[2]
In 2001, Command Prompt, Inc. released Mammoth PostgreSQL, the oldest surviving commercial PostgreSQL distribution. The company continues to actively support the PostgreSQL community through developer sponsorships and projects including PL/Perl, PL/php, and hosting of community projects such as the PostgreSQL Build Farm.
In January 2005, PostgreSQL received backing from another database vendor. Pervasive Software, well known for its Btrieve product, which was ubiquitous on the Novell NetWare platform, announced commercial support and community participation. Although it achieved success for a time, Pervasive left the PostgreSQL support market in July 2006. [Open letter to the PostgreSQL Community, John Farr]
In mid-2005 two other companies announced plans to commercialize PostgreSQL, each focusing on a separate niche market. EnterpriseDB announced plans to focus on adding functionality that allows applications written for Oracle to run more readily on PostgreSQL. Greenplum contributed enhancements directed at data warehouse and business intelligence applications, notably including the BizGres project.
In October 2005, John Loiacono, executive vice-president of software at Sun Microsystems, commented that "We're not going to OEM Microsoft but we are looking at PostgreSQL right now,"[3] although no specifics were released at that time. By November 2005, Sun Microsystems had announced support for PostgreSQL.[2] As of June 2006, Sun Solaris 10 6/06 ships PostgreSQL.
The PostgreSQL project itself continues to make yearly major releases and minor "bugfix" releases, all available under the BSD license and based on contributions from commercial vendors, support companies, and open-source hackers at large.
Features
Functions
Functions allow blocks of code to be executed by the server. Although these blocks can be written in SQL, the lack of basic programming operations, such as branching and looping, has driven the adoption of other languages inside of functions. Some of the languages can even execute inside of triggers. Functions in PostgreSQL can be written in the following languages:
★ A built-in language called PL/pgSQL, which resembles Oracle's procedural language PL/SQL.
★ Scripting languages, supported through PL/Perl, plPHP, PL/Python, PL/Ruby, PL/sh, PL/Tcl and PL/Scheme.
★ Compiled languages: C, C++, or Java (via PL/Java).
★ The statistical language R, through PL/R.
PostgreSQL supports row-returning functions, where the output of the function is a set of values which can be treated much like a table within queries.
Functions can be defined to execute with the privileges of either the caller or the user who defined the function. Functions are sometimes referred to as stored procedures, although there is a slight technical distinction between the two.
Indices
User-defined index methods can be created, or the built-in B-tree, hash table and GiST indices can be used. Indexes in PostgreSQL also support the following features:
★ PostgreSQL is capable of scanning indexes backwards when needed; a separate index is never needed to support ORDER BY field DESC.
★ Expression indexes can be created on the result of an expression or function, instead of simply on the value of a column.
★ Partial indexes, which only index part of a table, can be created by adding a WHERE clause to the end of the CREATE INDEX statement. This allows a smaller index to be created.
★ Bitmap index scans are supported as of version 8.1. This involves reading multiple indexes and generating a bitmap that expresses their intersection with the tuples that match the selection criteria. This provides a way of composing indexes together; on a table with 20 columns, there are, in principle, 20! indexes that could be defined, which is far too many to actually use. If one index is created on each column, bitmap scans can compose arbitrary combinations of those indexes at query time for each column that seems worth considering as a constraint.
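The idea of composing single-column indexes through bitmaps can be illustrated with a toy model. The following Python sketch is not PostgreSQL's implementation: it assumes a small hypothetical in-memory table (ROWS) and represents the result of each single-column lookup as an integer in which bit i is set when row i matches. Combining conditions then reduces to bitwise AND (intersection) or OR (union). The last line computes the 20! figure quoted above.

```python
from math import factorial

# Hypothetical table: each row is (color, size, in_stock). Illustrative data only.
ROWS = [
    ("red", "S", True),
    ("blue", "M", True),
    ("red", "M", False),
    ("red", "M", True),
    ("green", "L", True),
]


def build_bitmap(column, value):
    """Return an int whose bit i is set when ROWS[i][column] == value."""
    bitmap = 0
    for i, row in enumerate(ROWS):
        if row[column] == value:
            bitmap |= 1 << i
    return bitmap


def rows_from_bitmap(bitmap):
    """Translate the set bits of a bitmap back into row positions."""
    return [i for i in range(len(ROWS)) if (bitmap >> i) & 1]


# One simulated "index scan" per column, then intersect (AND) the bitmaps.
red = build_bitmap(0, "red")
medium = build_bitmap(1, "M")
in_stock = build_bitmap(2, True)
matching = rows_from_bitmap(red & medium & in_stock)

# Number of orderings of 20 columns, the figure quoted in the text.
orderings_of_20_columns = factorial(20)

print(matching)
print(orderings_of_20_columns)
```

In this model three small per-column structures answer any AND/OR combination of the three conditions, which is the motivation the text gives for preferring one index per column over a large number of multi-column indexes.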
Triggers
Triggers are events triggered by the action of an SQL query. For example, an INSERT query might activate a trigger that checks whether the values of the query are valid. Most triggers are activated only by INSERT or UPDATE queries.
Triggers are fully supported and can be attached to tables but not to views. Views can have rules, though. Multiple triggers are fired in alphabetical order. In addition to calling functions written in the native PL/pgSQL, triggers can also invoke functions written in other languages like PL/Perl.
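The behaviour described above (a validation trigger on INSERT, and multiple triggers firing in alphabetical order of their names) can be sketched with a toy model. The Table class, the trigger names and the row layout below are hypothetical and exist only for illustration; the sketch does not use or imitate any PostgreSQL API.

```python
class Table:
    """Toy table that fires named triggers before each INSERT (hypothetical model)."""

    def __init__(self):
        self.rows = []
        self.triggers = {}  # trigger name -> callable(row)
        self.fired = []     # trigger names, in the order they were fired

    def create_trigger(self, name, func):
        self.triggers[name] = func

    def insert(self, row):
        # Fire triggers in alphabetical order of their names.
        for name in sorted(self.triggers):
            self.fired.append(name)
            self.triggers[name](row)  # a trigger may raise to reject the row
        self.rows.append(row)


def check_price(row):
    """Validation trigger: reject rows with a negative price."""
    if row["price"] < 0:
        raise ValueError("price must be non-negative")


def normalize_name(row):
    """Trigger that modifies the row before it is stored."""
    row["name"] = row["name"].strip()


table = Table()
# Registered out of order on purpose; firing order depends on the name only.
table.create_trigger("b_normalize", normalize_name)
table.create_trigger("a_check", check_price)

table.insert({"name": "  widget ", "price": 5})

try:
    table.insert({"name": "gadget", "price": -1})
    rejected = False
except ValueError:
    rejected = True

print(table.rows, table.fired, rejected)
```

Because the firing order depends only on the names, a naming prefix such as "a_" or "b_" is one way to make the intended order explicit.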
MVCC
PostgreSQL manages concurrency through a system known as Multi-Version Concurrency Control (MVCC), which gives each user a "snapshot" of the database, allowing changes to be made without being visible to other users until a transaction is committed. This largely eliminates the need for read locks, and ensures the database maintains the ACID principles in an efficient manner.
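The snapshot idea can be sketched with a toy multi-version store. This is a simplified illustration, not PostgreSQL's implementation: the ToyMVCC class and its methods are hypothetical, the snapshot is fixed once at transaction start, and write conflicts, rollback and cleanup of old versions are ignored.

```python
import itertools


class ToyMVCC:
    """Toy multi-version store: every write appends a new row version."""

    def __init__(self):
        self._txid = itertools.count(1)
        self.committed = set()  # ids of committed transactions
        self.versions = {}      # key -> list of (writer_txid, value)

    def begin(self):
        # The snapshot is the set of transactions committed at this moment.
        return {"txid": next(self._txid), "snapshot": set(self.committed)}

    def write(self, tx, key, value):
        self.versions.setdefault(key, []).append((tx["txid"], value))

    def read(self, tx, key):
        # Newest version written by this transaction or by one in its snapshot.
        for writer, value in reversed(self.versions.get(key, [])):
            if writer == tx["txid"] or writer in tx["snapshot"]:
                return value
        return None

    def commit(self, tx):
        self.committed.add(tx["txid"])


db = ToyMVCC()
setup = db.begin()
db.write(setup, "balance", 100)
db.commit(setup)

writer = db.begin()
reader = db.begin()
db.write(writer, "balance", 50)

seen_by_writer = db.read(writer, "balance")
seen_by_reader_before_commit = db.read(reader, "balance")
db.commit(writer)
seen_by_reader_after_commit = db.read(reader, "balance")
seen_by_new_transaction = db.read(db.begin(), "balance")

print(seen_by_writer, seen_by_reader_before_commit,
      seen_by_reader_after_commit, seen_by_new_transaction)
```

The reader never waits for the writer and never takes a read lock: it simply skips versions that are not visible in its snapshot, while a transaction started after the commit sees the new value.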
Rules
Rules allow the "query tree" of an incoming query to be rewritten. One common usage is to implement updatable views.
Data types
A wide variety of native data types are supported, including:
★ Arbitrary precision numerics
★ Unlimited length text
★ Geometric primitives
★ IP and IPv6 addresses
★ CIDR blocks, and MAC address data types
★ Arrays
In addition, users can create their own data types which can usually be made fully indexable via PostgreSQL's GiST infrastructure. Examples of these are the Geographic information system (GIS) data types from the PostGIS project for PostgreSQL.
User-defined objects
New types of almost all objects inside the database can be created, including:
★ Indices
★ Operators (existing ones can be overloaded)
★ Aggregate functions
★ Domains
★ Casts
★ Conversions
Inheritance
Tables can be set to inherit their characteristics from a "parent" table. Data is shared between "parent" and "child(ren)" tables. Tuples inserted or deleted in the "child" table will respectively be inserted or deleted in the "parent" table. Adding a column to the parent table also causes that column to appear in the child table. This feature is not yet fully supported; in particular, table constraints are not currently inheritable. As a result, attempting to insert the id of a row from a child table into a table that has a foreign key constraint referencing the parent table will fail, because PostgreSQL does not recognize that the id from the child table is also a valid id in the parent table.
Inheritance provides a way to map the features of generalization hierarchies depicted in Entity Relationship Diagrams (ERD) directly into the PostgreSQL database.
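The foreign key limitation described above can be made concrete with a toy model. In this hypothetical sketch (the ToyTable class and the cities/capitals tables are illustrative names, not PostgreSQL internals), a query through the parent sees the rows of its children, but the foreign key check looks only at rows stored in the parent itself.

```python
class ToyTable:
    """Toy table with optional inheritance from a parent table."""

    def __init__(self, name, parent=None):
        self.name = name
        self.own_rows = {}  # id -> row stored in this table itself
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def insert(self, row_id, row):
        self.own_rows[row_id] = row

    def select_ids(self):
        """Query through this table: own rows plus rows of all descendants."""
        ids = set(self.own_rows)
        for child in self.children:
            ids |= child.select_ids()
        return ids

    def fk_target_exists(self, row_id):
        """Foreign key check as described in the text: only own rows count."""
        return row_id in self.own_rows


cities = ToyTable("cities")
capitals = ToyTable("capitals", parent=cities)

cities.insert(1, {"name": "Las Vegas"})
capitals.insert(2, {"name": "Madison"})

visible_through_parent = cities.select_ids()
parent_row_accepted = cities.fk_target_exists(1)
child_row_accepted = cities.fk_target_exists(2)

print(visible_through_parent, parent_row_accepted, child_row_accepted)
```

Row 2 is visible when querying cities, yet a reference to it is rejected because the check does not descend into capitals, which is the mismatch the paragraph on constraints describes.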
Other features
★ Referential integrity constraints including foreign key constraints, column constraints, and row checks
★ Views. While updatable views have not been implemented, the same functionality can be achieved using the rules system.
★ Full, inner, and outer (left and right) joins
★ Sub-selects
★ Transactions
★ Supports most of the major features of the SQL standard
★ Encrypted connections via SSL
★ Binary and textual large-object storage
★ Online backup
★ Domains
★ Tablespaces
★ Savepoints
★ Point-in-time recovery
★ Two-phase commit
★ TOAST (The Oversized-Attribute Storage Technique) is used to transparently store large table attributes (such as big MIME attachments or XML messages) in a separate area, with automatic compression.
★ Regular expressions [2]
Add-ons
★ Geographic objects via PostGIS (GPL).
★ Full text search via Tsearch2 and OpenFTS (GPL).
★ Several asynchronous master/slave replication packages, including:
  ★ Slony-I (BSD license)
  ★ pgcluster (BSD license)
  ★ Mammoth Replicator
★ XML/XSLT support via XPath Extensions in the contrib section (GPL).
Benchmarks
Many informal performance studies of PostgreSQL have been done [PostgreSQL publishes first real benchmark, Josh Berkus], but the first industry-standard and peer-validated benchmark was completed in June 2007 using the Sun Java System Application Server 9.0 Platform Edition, an UltraSPARC T1 based Sun Fire server and PostgreSQL 8.2 [SPECjAppServer®2004 Result]. This result of 778.14 SPECjAppServer2004 JOPS@Standard compares favourably with the 874 JOPS@Standard achieved with Oracle 10 on an Itanium-based HP-UX system [PostgreSQL publishes first real benchmark, Josh Berkus].
In August 2007, Sun submitted an improved benchmark score of 813.73 SPECjAppServer2004 JOPS@Standard. With the system under test at a reduced price, the price/performance improved from US$84.98/JOPS to US$70.57/JOPS.[5]
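The relationship between these figures can be checked with a short calculation. The sketch below uses only the numbers quoted above and assumes that price/performance is simply the total system price divided by the JOPS score, so the implied system prices are derived estimates from rounded inputs rather than published values.

```python
# Figures quoted in the text: SPECjAppServer2004 JOPS@Standard and US$ per JOPS.
june_jops, june_price_per_jops = 778.14, 84.98
august_jops, august_price_per_jops = 813.73, 70.57

# Assumption: implied system price = throughput x price per unit of throughput.
june_system_price = june_jops * june_price_per_jops
august_system_price = august_jops * august_price_per_jops

# Relative changes between the two submissions.
throughput_gain = august_jops / june_jops - 1
price_perf_gain = 1 - august_price_per_jops / june_price_per_jops

print(f"implied June system price:   US${june_system_price:,.0f}")
print(f"implied August system price: US${august_system_price:,.0f}")
print(f"throughput gain:             {throughput_gain:.1%}")
print(f"price/performance gain:      {price_perf_gain:.1%}")
```

Under that assumption the throughput rose by roughly 4.6% while the cost per JOPS fell by roughly 17%, which is consistent with the statement that most of the price/performance improvement came from the reduced system price.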
Prominent users
★ .org, .info, .mobi and .aero domain registry via Afilias [6]
★ The American Chemical Society
★ BASF
★ IMDB
★ Skype
★ TiVo
★ Penny Arcade
★ Sony Online [7]
★ U.S. Department of Labor
★ USPS
★ VeriSign
★ Wisconsin Circuit Court Access, with 6 × 180 GB databases replicated in real time
★ OpenACS and .LRN
References
★ Neil Matthew, Beginning Databases with PostgreSQL, ISBN 1-59059-478-9
★ W. Jason Gilmore, Beginning PHP and PostgreSQL 8: From Novice to Professional, ISBN 1-59059-547-5
★ John C. Worsley, Practical PostgreSQL, ISBN 1-56592-846-6
★ Korry Douglas, PostgreSQL, ISBN 0-672-32756-2
Notes
1. Interview: Bruce Momjian Maya Tamiya
2.
3. Sun's software chief eyes databases, groupware
4.
5. SPECjAppServer®2004 Result
6. PostgreSQL affiliates .ORG domain
7. Sony Online opts for open-source database over Oracle
External links
About PostgreSQL
★ Planet PostgreSQL, blog aggregator
★ Database Journal articles on PostgreSQL
★ Linux Productivity Magazine: a complete issue on PostgreSQL
★ A rebuttal to the FUD (fear, uncertainty, and doubt) surrounding much of the criticism against PostgreSQL
★ PostgreSQL gotchas, documented but counterintuitive behavior
★ Test_PGC, example embedded SQL/C program for PostgreSQL showing database operations and SQLSTATE testing
External PostgreSQL-related projects
The developers of PostgreSQL try to keep the system itself down to a set of "core" features, rather than encouraging extensions to be rolled into the main system. Here are places where "secondary" projects are managed:
★ PgFoundry PostgreSQL-related projects
★ SourceForge PostgreSQL-related projects
PostgreSQL documentation
★ PostgreSQL FAQ (Frequently Asked Questions)
★ PostgreSQL Website
★ PostgreSQL Documentation
Performance tuning documentation
★ PostgreSQL Performance Tuning
★ Tuning PostgreSQL for performance
★ Annotated POSTGRESQL.CONF Guide for PostgreSQL