This is reportedly PostgreSQL's control file.

To see what it actually does, start by renaming it: rename the pg_control file, then start Postgres. Startup fails with this message:

[postgres@pg101 bin]$ postgres: could not find the database system
Expected to find it in the directory "/usr/local/pgsql/bin/../data",
but could not open file "/usr/local/pgsql/bin/../data/global/pg_control": No Such file or Directory

The corresponding source code is in the checkDataDir function in postmaster.c:

        snprintf(path, sizeof(path), "%s/global/pg_control", DataDir);

        fp = AllocateFile(path, PG_BINARY_R);
        if (fp == NULL)
        {
            write_stderr("%s: could not find the database system\n"
                         "Expected to find it in the directory \"%s\",\n"
                         "but could not open file \"%s\": %s\n",
                         progname, DataDir, path, strerror(errno));
            ExitPostmaster(2);
        }
        FreeFile(fp);

After renaming pg_control back to its original name and restarting PostgreSQL, the database starts without problems.
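The same existence check can be tried outside the server. Below is a minimal sketch in plain C using stdio instead of the backend's AllocateFile/FreeFile. The helper name check_control_file is hypothetical and is not part of PostgreSQL; it only mirrors the "build the path, try to open it, report errno" pattern shown above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): returns 0 if
 * <data_dir>/global/pg_control can be opened for reading; otherwise
 * prints the reason to stderr and returns -1.
 */
static int
check_control_file(const char *data_dir)
{
    char  path[1024];
    FILE *fp;

    /* Same path layout as in checkDataDir: <data_dir>/global/pg_control */
    snprintf(path, sizeof(path), "%s/global/pg_control", data_dir);

    fp = fopen(path, "rb");
    if (fp == NULL)
    {
        fprintf(stderr, "could not open file \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }
    fclose(fp);
    return 0;
}
```

Calling check_control_file("/usr/local/pgsql/data") while the file is renamed should return -1, and 0 again once the name is restored.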

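main.c contains the following code:

The comment shows that once a database has been initialized, settings such as LC_CTYPE and LC_COLLATE have already been written to the pg_control file.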

        /*
         * Set up locale information from environment. Note that LC_CTYPE and
         * LC_COLLATE will be overridden later from pg_control if we are in an
         * already-initialized database. We set them here so that they will be
         * available to fill pg_control during initdb. LC_MESSAGES will get set
         * later during GUC option processing, but we set it here to allow startup
         * error messages to be localized.
         */

        set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("postgres"));
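To make "set up locale information from environment" concrete, here is a small standard-C sketch. It is not PostgreSQL's code: the helper name set_category_from_env and the fallback to the "C" locale are my own assumptions for illustration. The only library call used is setlocale, where an empty string asks the C library to read the locale from the environment.

```c
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): set one locale category from
 * the environment. The fallback to "C" is an assumption of this sketch.
 * Returns the name of the locale now in effect, or NULL on failure.
 */
static const char *
set_category_from_env(int category)
{
    /* "" means: consult LC_ALL, the matching LC_* variable, then LANG */
    const char *result = setlocale(category, "");

    if (result == NULL)
        result = setlocale(category, "C");  /* environment names an unknown locale */

    return result;
}
```

Calling set_category_from_env(LC_COLLATE) and set_category_from_env(LC_CTYPE) early in a program roughly corresponds to the "we set them here" step in the comment; per that comment, an already-initialized database later overrides both from pg_control.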

src/backend/access/transam/xlog.c contains the following code:

/*
* We maintain an image of pg_control in shared memory.
*/
static ControlFileData *ControlFile = NULL;

So there is an in-memory structure that mirrors the pg_control file.

It looks like this:

/*
 * Contents of pg_control.
 *
 * NOTE: try to keep this under 512 bytes so that it will fit on one physical
 * sector of typical disk drives. This reduces the odds of corruption due to
 * power failure midway through a write.
 */

typedef struct ControlFileData
{
    /*
     * Unique system identifier --- to ensure we match up xlog files with the
     * installation that produced them.
     */
    uint64      system_identifier;

    /*
     * Version identifier information. Keep these fields at the same offset,
     * especially pg_control_version; they won't be real useful if they move
     * around. (For historical reasons they must be 8 bytes into the file
     * rather than immediately at the front.)
     *
     * pg_control_version identifies the format of pg_control itself.
     * catalog_version_no identifies the format of the system catalogs.
     *
     * There are additional version identifiers in individual files; for
     * example, WAL logs contain per-page magic numbers that can serve as
     * version cues for the WAL log.
     */
    uint32      pg_control_version;     /* PG_CONTROL_VERSION */
    uint32      catalog_version_no;     /* see catversion.h */

    /*
     * System status data
     */
    DBState     state;              /* see enum above */
    pg_time_t   time;               /* time stamp of last pg_control update */
    XLogRecPtr  checkPoint;         /* last check point record ptr */
    XLogRecPtr  prevCheckPoint;     /* previous check point record ptr */

    CheckPoint  checkPointCopy;     /* copy of last check point record */

    /*
     * These two values determine the minimum point we must recover up to
     * before starting up:
     *
     * minRecoveryPoint is updated to the latest replayed LSN whenever we
     * flush a data change during archive recovery. That guards against
     * starting archive recovery, aborting it, and restarting with an earlier
     * stop location. If we've already flushed data changes from WAL record X
     * to disk, we mustn't start up until we reach X again. Zero when not
     * doing archive recovery.
     *
     * backupStartPoint is the redo pointer of the backup start checkpoint, if
     * we are recovering from an online backup and haven't reached the end of
     * backup yet. It is reset to zero when the end of backup is reached, and
     * we mustn't start up before that. A boolean would suffice otherwise, but
     * we use the redo pointer as a cross-check when we see an end-of-backup
     * record, to make sure the end-of-backup record corresponds the base
     * backup we're recovering from.
     */
    XLogRecPtr  minRecoveryPoint;
    XLogRecPtr  backupStartPoint;

    /*
     * Parameter settings that determine if the WAL can be used for archival
     * or hot standby.
     */
    int         wal_level;
    int         MaxConnections;
    int         max_prepared_xacts;
    int         max_locks_per_xact;

    /*
     * This data is used to check for hardware-architecture compatibility of
     * the database and the backend executable. We need not check endianness
     * explicitly, since the pg_control version will surely look wrong to a
     * machine of different endianness, but we do need to worry about MAXALIGN
     * and floating-point format. (Note: storage layout nominally also
     * depends on SHORTALIGN and INTALIGN, but in practice these are the same
     * on all architectures of interest.)
     *
     * Testing just one double value is not a very bulletproof test for
     * floating-point compatibility, but it will catch most cases.
     */
    uint32      maxAlign;           /* alignment requirement for tuples */
    double      floatFormat;        /* constant 1234567.0 */
#define FLOATFORMAT_VALUE   1234567.0

    /*
     * This data is used to make sure that configuration of this database is
     * compatible with the backend executable.
     */
    uint32      blcksz;             /* data block size for this DB */
    uint32      relseg_size;        /* blocks per segment of large relation */

    uint32      xlog_blcksz;        /* block size within WAL files */
    uint32      xlog_seg_size;      /* size of each WAL segment */

    uint32      nameDataLen;        /* catalog name field width */
    uint32      indexMaxKeys;       /* max number of columns in an index */

    uint32      toast_max_chunk_size;   /* chunk size in TOAST tables */

    /* flag indicating internal format of timestamp, interval, time */
    bool        enableIntTimes;     /* int64 storage enabled? */

    /* flags indicating pass-by-value status of various types */
    bool        float4ByVal;        /* float4 pass-by-value? */
    bool        float8ByVal;        /* float8, int8, etc pass-by-value? */

    /* CRC of all above ... MUST BE LAST! */
    pg_crc32    crc;
} ControlFileData;
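The struct comments say the system identifier comes first and the two version fields sit 8 bytes into the file. Here is a minimal sketch of reading just those leading fields from a raw buffer. The ControlHeader type and both function names are hypothetical, chosen for illustration. The byte-order heuristic is my own reading of the remark that the version "will surely look wrong to a machine of different endianness"; I am not presenting it as the server's actual check.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical cut-down view of the first 16 bytes of pg_control. */
typedef struct ControlHeader
{
    uint64_t system_identifier;     /* bytes 0..7 */
    uint32_t pg_control_version;    /* bytes 8..11 */
    uint32_t catalog_version_no;    /* bytes 12..15 */
} ControlHeader;

/*
 * Copy the leading fields out of a raw buffer, in native byte order.
 * Returns 0 on success, -1 if the buffer is too short.
 */
static int
parse_control_header(const unsigned char *buf, size_t len, ControlHeader *out)
{
    if (len < 16)
        return -1;

    memcpy(&out->system_identifier, buf, 8);
    memcpy(&out->pg_control_version, buf + 8, 4);
    memcpy(&out->catalog_version_no, buf + 12, 4);
    return 0;
}

/*
 * Heuristic (an assumption of this sketch): a small version number written
 * by a machine of the opposite byte order reads back with its low 16 bits
 * all zero and something non-zero in the high 16 bits.
 */
static int
looks_byte_swapped(uint32_t version)
{
    return version % 65536 == 0 && version / 65536 != 0;
}
```

To try it, read the first 16 bytes of global/pg_control into a buffer, pass them to parse_control_header, and then check pg_control_version with looks_byte_swapped.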

Next, let's go through the fields one by one.
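Starting from the end of the struct: the last comment says crc covers everything above it and must stay last. Here is a minimal sketch of that layout idea, using a hypothetical three-field MiniControl record and a textbook bitwise CRC-32. I do not claim this routine matches PostgreSQL's pg_crc32 bit for bit. What matters is that the checksum is computed over offsetof(struct, crc) bytes, which only works if crc is the final field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature record; only the "crc is last" layout matters. */
typedef struct MiniControl
{
    uint32_t version;
    uint32_t blcksz;
    uint32_t crc;           /* covers every byte before this field */
} MiniControl;

/* Textbook bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t
crc32_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    size_t   i;
    int      bit;

    for (i = 0; i < len; i++)
    {
        crc ^= p[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Store the checksum of all bytes that precede the crc field. */
static void
mini_control_seal(MiniControl *c)
{
    c->crc = crc32_bytes(c, offsetof(MiniControl, crc));
}

/* Recompute the checksum and compare it with the stored one. */
static int
mini_control_is_valid(const MiniControl *c)
{
    return c->crc == crc32_bytes(c, offsetof(MiniControl, crc));
}
```

Seal a record, change any field other than crc, and mini_control_is_valid reports the mismatch. If crc were moved into the middle of the struct, the fields after it would no longer be covered.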

04-17 05:16