14.4. Populating a Database

One might need to insert a large amount of data when first
populating a database. This section contains some suggestions on
how to make this process as efficient as possible.

14.4.1. Disable Autocommit

Turn off autocommit and do a single commit at the end. (In
plain SQL, this means issuing BEGIN at the start and COMMIT at
the end. Some client libraries manage transactions implicitly,
in which case you need to make sure the library begins and
commits the transaction when you want it done.) If you allow
each insertion to be committed separately, PostgreSQL does a lot
of work for each row that is added. An additional benefit of
doing all insertions in one transaction is that if the insertion
of one row fails, all rows inserted up to that point are rolled
back, so you won't be stuck with partially loaded data.
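
As a minimal sketch of the single-transaction approach, the
following wraps several insertions between BEGIN and COMMIT. The
table name items and its columns are hypothetical, chosen only
for illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE items (id integer, name text);

-- Open one explicit transaction for the whole load.
BEGIN;
INSERT INTO items (id, name) VALUES (1, 'alpha');
INSERT INTO items (id, name) VALUES (2, 'beta');
INSERT INTO items (id, name) VALUES (3, 'gamma');
-- A single commit at the end; if any INSERT above had failed,
-- none of the rows would have been kept.
COMMIT;
```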

14.4.2. Use COPY

Use COPY to load all the rows in one command, instead of using a
series of INSERT commands. The COPY command is optimized for
loading large numbers of rows; it is less flexible than INSERT,
but incurs significantly less overhead for large data loads.
Since COPY is a single command, there is no need to disable
autocommit if you use this method to populate a table.
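
A hedged sketch of a COPY-based load might look like the
following. The table items and the path /tmp/items.csv are
assumptions made for illustration; with this form of COPY the
file is read by the server, so the path must be readable by the
server process.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE items (id integer, name text);

-- Load every row of the file in one command.
-- /tmp/items.csv is an assumed path to a comma-separated file
-- whose columns are id and name, in that order.
COPY items (id, name) FROM '/tmp/items.csv' WITH CSV;
```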

If you cannot use COPY, it might
help to use PREPARE
to create a prepared INSERT statement,
and then use EXECUTE as many times as
required. This avoids some of the overhead of repeatedly
parsing and planning INSERT. Different
interfaces provide this facility in different ways; look for
"prepared statements" in the
interface documentation.
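
At the SQL level, the prepared-statement approach can be
sketched as follows. The statement name insert_item and the
table items are hypothetical; client interfaces usually expose
the same facility through their own API rather than through
these commands.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE items (id integer, name text);

-- Parse the INSERT once, with two typed parameters.
PREPARE insert_item (integer, text) AS
    INSERT INTO items (id, name) VALUES ($1, $2);

-- Run it repeatedly inside one transaction.
BEGIN;
EXECUTE insert_item(1, 'alpha');
EXECUTE insert_item(2, 'beta');
EXECUTE insert_item(3, 'gamma');
COMMIT;

-- Release the prepared statement when the load is finished.
DEALLOCATE insert_item;
```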

Note that loading a large number of rows using COPY is almost
always faster than using INSERT, even if PREPARE is used and
multiple insertions are batched into a single transaction.

COPY is fastest when used within the same transaction as an
earlier CREATE TABLE or TRUNCATE command. In such cases no WAL
needs to be written, because in case of an error, the files
containing the newly loaded data will be removed anyway.
However, this consideration does not apply when archive_mode is
set, as all commands must write WAL in that case.
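
The pattern described above can be sketched as follows, creating
the table and loading it inside one transaction. As before, the
table items and the file path /tmp/items.csv are assumptions for
illustration only.

```sql
BEGIN;
-- Creating the table in the same transaction as the load is
-- what allows the WAL to be skipped (unless archive_mode is set).
CREATE TABLE items (id integer, name text);
COPY items (id, name) FROM '/tmp/items.csv' WITH CSV;
COMMIT;

-- For a table that already exists, the analogous pattern would
-- replace the CREATE TABLE line with: TRUNCATE items;
```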
