
Chapter 24. RPM Package File Structure

24.1. The Package File
24.1.1. The file identifier
24.1.2. The signature
24.1.3. The header
24.1.4. The payload
This appendix describes the format of RPM package files. You can combine this information with C, Perl, or Python data structures to access the information. In all cases, you should access elements in an RPM file using one of the available programming libraries. Do not attempt to access the files directly, as you may inadvertently damage the RPM file.
Cross Reference
Chapter 15, Programming RPM with C, Chapter 16, Programming RPM with Python, and Chapter 17, Programming RPM with Perl cover programming with C, Python, and Perl, respectively.
The RPM package format described here has been standardized as part of the Linux Standard Base, or LSB, version 1.3.
Cross Reference
The LSB 1.3 section on package file formats is available at www.linuxbase.org/spec/refspecs/LSB_1.3.0/gLSB/gLSB.html#PACKAGEFMT.

24.1. The Package File

RPM packages are delivered with one file per package. All RPM files share the same basic format of four sections:
* A lead or file identifier
* A signature
* Header information
* Archive of the payload, the files to install
All values are encoded in network byte order, for portability to multiple processor architectures.
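To illustrate what network byte order means in practice, the following minimal sketch decodes the same four bytes as big-endian (network order) and as little-endian. The byte values are made up for the example and do not come from a real package.

```python
import struct

# Four bytes as they might appear in an RPM file (illustrative values only).
raw = bytes([0x00, 0x00, 0x01, 0x02])

# ">" forces big-endian (network) order regardless of the host CPU;
# "I" is an unsigned 32-bit integer.
(value,) = struct.unpack(">I", raw)
print(value)  # 258

# Decoding the same bytes as little-endian gives a different, wrong number.
(wrong,) = struct.unpack("<I", raw)
print(wrong)  # 33619968
```

Always spelling out the byte order in the format string keeps the result identical on every architecture.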

24.1.1. The file identifier

Also called the lead or the rpmlead, the identifier marks that this file is an RPM file. It contains a magic number that the file command uses to detect RPM files. It also contains version and architecture information.
The start of the identifier is the so-called magic number. The file command reads the first few bytes of a file and compares the values found with the contents of /usr/share/magic (/etc/magic on many UNIX systems), a database of magic numbers. This allows the file command to quickly identify files.
The identifier includes the RPM version number, that is, the version of the RPM file format used for the package. The identifier also has a flag that tells the type of the RPM file, whether the file contains a binary or source package. An architecture flag allows RPM software to double-check that you are not trying to install a package for a non-compatible architecture.
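The following sketch shows how the fields described above could be read for inspection. The 96-byte field layout and the magic bytes are taken from the LSB package format specification rather than from this chapter, so treat them as assumptions. The names `parse_lead` and `Lead` are hypothetical, and the sample data is synthetic. For real work, use one of the RPM libraries as recommended earlier.

```python
import struct
from typing import NamedTuple

# Assumed layout (per the LSB): magic, major, minor, type, archnum,
# name[66], osnum, signature_type, reserved[16]; big-endian, 96 bytes.
LEAD_FORMAT = ">4sBBhh66shh16s"
LEAD_MAGIC = b"\xed\xab\xee\xdb"

class Lead(NamedTuple):
    major: int        # major version of the RPM file format
    minor: int        # minor version of the RPM file format
    is_source: bool   # True for a source package, False for a binary one
    archnum: int      # architecture flag
    name: str         # package name stored in the lead

def parse_lead(data: bytes) -> Lead:
    size = struct.calcsize(LEAD_FORMAT)
    if len(data) < size:
        raise ValueError("data is shorter than an RPM lead")
    (magic, major, minor, pkg_type, archnum,
     name, _osnum, _sigtype, _reserved) = struct.unpack(LEAD_FORMAT, data[:size])
    if magic != LEAD_MAGIC:
        raise ValueError("magic number does not match; not an RPM file")
    text = name.split(b"\0", 1)[0].decode("ascii", "replace")
    return Lead(major, minor, pkg_type == 1, archnum, text)

# Synthetic lead built in memory for demonstration only.
sample = struct.pack(LEAD_FORMAT, LEAD_MAGIC, 3, 0, 0, 1,
                     b"demo-1.0-1", 1, 5, b"")
lead = parse_lead(sample)
print(lead)
```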

24.1.2. The signature

The signature appears after the lead or identifier section. The RPM signature helps verify the integrity of the package, and optionally the authenticity.
The signature works by applying a mathematical function to the header and archive sections of the file. The function can be a cryptographic signature, such as PGP (Pretty Good Privacy), or a message digest, such as MD5.

24.1.3. The header

The identifier section no longer contains enough information to describe modern RPMs. Furthermore, the identifier section is nowhere near as flexible as today’s packages require. To counter these deficiencies, the header section was introduced to include more information about the package.
The header structure contains three parts:
* Header record
* One or more header index record structures
* Data for the index record structures
The header record identifies this as the RPM header. It also contains a count of the number of index records and the size of the index record data.
Each index record contains four fields. A tag number identifies the data the entry holds; there are tag IDs for the copyright message, the name of the package, the version number, and so on. A type number identifies the type of the item. An offset indicates where in the data section the data for this header item begins. A count indicates how many items of the given type are in this header entry. For fixed-size types, you can multiply the count by the size of the type to get the number of bytes used for the header entry.
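As a rough illustration of the header record and index records just described, the sketch below parses a header built in memory. The exact byte layout (a 4-byte magic and version, 4 reserved bytes, then two 32-bit counts, followed by 16-byte index records) is taken from the LSB specification and is an assumption here. The names `parse_header` and `IndexEntry` are hypothetical.

```python
import struct
from typing import List, NamedTuple, Tuple

HEADER_MAGIC = b"\x8e\xad\xe8\x01"   # assumed magic plus version byte
RECORD = struct.Struct(">4s4xII")    # magic, 4 reserved bytes, index count, data size
INDEX = struct.Struct(">IIiI")       # tag, type, offset, count

class IndexEntry(NamedTuple):
    tag: int
    type: int
    offset: int
    count: int

def parse_header(buf: bytes) -> Tuple[List[IndexEntry], bytes]:
    """Return the index records and the raw data section of one header."""
    magic, nindex, data_size = RECORD.unpack_from(buf, 0)
    if magic != HEADER_MAGIC:
        raise ValueError("bad header magic")
    entries = [
        IndexEntry(*INDEX.unpack_from(buf, RECORD.size + i * INDEX.size))
        for i in range(nindex)
    ]
    start = RECORD.size + nindex * INDEX.size
    return entries, buf[start:start + data_size]

# Synthetic header with a single entry: tag 1000, type 6 (a string),
# offset 0, count 1.
store = b"demo\0"
blob = RECORD.pack(HEADER_MAGIC, 1, len(store)) + INDEX.pack(1000, 6, 0, 1) + store
entries, data = parse_header(blob)
print(entries, data)
```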
Table D-1 lists the type identifiers.
Table D-1 Header type identifiers
| Constant | Value | Size in Bytes |
|---|---|---|
| RPM_NULL_TYPE | 0 | No size |
| RPM_CHAR_TYPE | 1 | 1 |
| RPM_INT8_TYPE | 2 | 1 |
| RPM_INT16_TYPE | 3 | 2 |
| RPM_INT32_TYPE | 4 | 4 |
| RPM_INT64_TYPE | 5 | Not supported yet |
| RPM_STRING_TYPE | 6 | Variable number of bytes, terminated by a NULL |
| RPM_BIN_TYPE | 7 | 1 |
| RPM_STRING_ARRAY_TYPE | 8 | Variable, vector of NULL-terminated strings |
| RPM_I18NSTRING_TYPE | 9 | Variable, vector of NULL-terminated strings |
Note
Integer values are aligned on 2-byte (16-bit integers) or 4-byte (32-bit integers) boundaries.
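The type sizes and the alignment rule can be combined into a small decoder for the data section. This is a minimal sketch: the helper names `align`, `read_strings`, and `read_entry` are hypothetical, treating the integers as unsigned is an assumption, and the data section is synthetic.

```python
import struct

# Fixed-size types from Table D-1: type number -> (struct code, size in bytes).
# Treating the integers as unsigned is an assumption made for this sketch.
FIXED_TYPES = {1: ("B", 1), 2: ("B", 1), 3: ("H", 2), 4: ("I", 4)}

def align(offset, size):
    """Round offset up to the next multiple of size."""
    return (offset + size - 1) // size * size

def read_strings(store, offset, count):
    """Read count consecutive NULL-terminated strings starting at offset."""
    out = []
    for _ in range(count):
        end = store.index(b"\0", offset)
        out.append(store[offset:end].decode("utf-8", "replace"))
        offset = end + 1
    return out

def read_entry(store, type_, offset, count):
    """Decode one header entry from the data section."""
    if type_ in FIXED_TYPES:
        code, size = FIXED_TYPES[type_]
        if offset + count * size > len(store):
            raise ValueError("entry runs past the end of the data section")
        return list(struct.unpack_from(f">{count}{code}", store, offset))
    if type_ == 7:                      # RPM_BIN_TYPE: count is a byte count
        return store[offset:offset + count]
    if type_ == 6:                      # RPM_STRING_TYPE
        return read_strings(store, offset, 1)[0]
    if type_ in (8, 9):                 # string array and i18n string
        return read_strings(store, offset, count)
    raise ValueError(f"unsupported type {type_}")

# Synthetic data section: a 5-byte string, padding, then two 32-bit integers.
name = b"demo\0"
int_offset = align(len(name), 4)        # 5 is rounded up to 8
store = name.ljust(int_offset, b"\0") + struct.pack(">II", 10, 20)
print(read_entry(store, 6, 0, 1))           # demo
print(read_entry(store, 4, int_offset, 2))  # [10, 20]
```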

24.1.3.1. Header Tags

Table D-2 lists the tag identifiers.
Table D-2 Header entry tag identifiers
| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_NAME | 1000 | STRING | Yes |
| RPMTAG_VERSION | 1001 | STRING | Yes |
| RPMTAG_RELEASE | 1002 | STRING | Yes |
| RPMTAG_SUMMARY | 1004 | I18NSTRING | Yes |
| RPMTAG_DESCRIPTION | 1005 | I18NSTRING | Yes |
| RPMTAG_BUILDTIME | 1006 | INT32 | Optional |
| RPMTAG_BUILDHOST | 1007 | STRING | Optional |
| RPMTAG_SIZE | 1009 | INT32 | Yes |
| RPMTAG_LICENSE | 1014 | STRING | Yes |
| RPMTAG_GROUP | 1016 | I18NSTRING | Yes |
| RPMTAG_OS | 1021 | STRING | Yes |
| RPMTAG_ARCH | 1022 | STRING | Yes |
| RPMTAG_SOURCERPM | 1044 | STRING | Optional |
| RPMTAG_FILEVERIFYFLAGS | 1045 | INT32 | Optional |
| RPMTAG_ARCHIVESIZE | 1046 | INT32 | Optional |
| RPMTAG_RPMVERSION | 1064 | STRING | Optional |
| RPMTAG_CHANGELOGTIME | 1080 | INT32 | Optional |
| RPMTAG_CHANGELOGNAME | 1081 | STRING_ARRAY | Optional |
| RPMTAG_CHANGELOGTEXT | 1082 | STRING_ARRAY | Optional |
| RPMTAG_COOKIE | 1094 | STRING | Optional |
| RPMTAG_OPTFLAGS | 1122 | STRING | Optional |
| RPMTAG_PAYLOADFORMAT | 1124 | STRING | Yes |
| RPMTAG_PAYLOADCOMPRESSOR | 1125 | STRING | Yes |
| RPMTAG_PAYLOADFLAGS | 1126 | STRING | Yes |
| RPMTAG_RHNPLATFORM | 1131 | STRING | Deprecated |
| RPMTAG_PLATFORM | 1132 | STRING | Optional |
Most of these tags are self-explanatory; however, a few tags hold special meaning. The RPMTAG_SIZE tag holds the size of all the regular files in the payload. The RPMTAG_ARCHIVESIZE tag holds the uncompressed size of the payload section, including the necessary cpio headers. The RPMTAG_COOKIE tag holds an opaque string.
According to the LSB standards, the RPMTAG_PAYLOADFORMAT must always be cpio. The RPMTAG_PAYLOADCOMPRESSOR must be gzip. The RPMTAG_PAYLOADFLAGS must always be 9.
The RPMTAG_OPTFLAGS tag holds special compiler flags used to build the package. The RPMTAG_PLATFORM and RPMTAG_RHNPLATFORM tags hold opaque strings.
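The three LSB payload requirements stated above lend themselves to a simple check. The sketch below assumes the header tags have already been decoded into a dictionary keyed by tag number; the function name `lsb_payload_problems` is hypothetical.

```python
# Tag numbers from Table D-2 mapped to the values the LSB requires.
LSB_PAYLOAD_RULES = {
    1124: ("RPMTAG_PAYLOADFORMAT", "cpio"),
    1125: ("RPMTAG_PAYLOADCOMPRESSOR", "gzip"),
    1126: ("RPMTAG_PAYLOADFLAGS", "9"),
}

def lsb_payload_problems(tags):
    """Return a list of messages for payload tags that break the LSB rules."""
    problems = []
    for number, (name, expected) in LSB_PAYLOAD_RULES.items():
        actual = tags.get(number)
        if actual != expected:
            problems.append(f"{name}: expected {expected!r}, found {actual!r}")
    return problems

# Hypothetical decoded tag dictionaries.
ok = lsb_payload_problems({1124: "cpio", 1125: "gzip", 1126: "9"})
bad = lsb_payload_problems({1124: "cpio", 1125: "bzip2", 1126: "9"})
print(ok)   # []
print(bad)  # ["RPMTAG_PAYLOADCOMPRESSOR: expected 'gzip', found 'bzip2'"]
```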

24.1.3.2. Private Header Tags

Table D-3 lists header tags that are considered private.
Table D-3 Private header tags
| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_HEADERSIGNATURES | 62 | BIN | Optional |
| RPMTAG_HEADERIMMUTABLE | 63 | BIN | Optional |
| RPMTAG_HEADERI18NTABLE | 100 | STRING_ARRAY | Yes |
The RPMTAG_HEADERSIGNATURES tag indicates that this is a signature entry. The RPMTAG_HEADERIMMUTABLE tag indicates a header item that is used in the calculation of signatures. This data should be preserved.
The RPMTAG_HEADERI18NTABLE tag holds a table of locales used for international text lookup.

24.1.3.3. Signature Tags

The signature section is implemented as a header structure, but it is not considered part of the RPM header. Table D-4 lists special signature-related tags.
Table D-4 Signature-related tags
| Constant | Value | Type | Required? |
|---|---|---|---|
| SIGTAG_SIGSIZE | 1000 | INT32 | Yes |
| SIGTAG_PGP | 1002 | BIN | Optional |
| SIGTAG_MD5 | 1004 | BIN | Yes |
| SIGTAG_GPG | 1005 | BIN | Optional |
| SIGTAG_PAYLOADSIZE | 1007 | INT32 | Optional |
| SIGTAG_SHA1HEADER | 1010 | STRING | Optional |
| SIGTAG_DSAHEADER | 1011 | BIN | Optional |
| SIGTAG_RSAHEADER | 1012 | BIN | Optional |
The SIGTAG_SIGSIZE tag specifies the combined size of the header and payload sections, while the SIGTAG_PAYLOADSIZE tag holds the uncompressed size of the payload.
To verify the integrity of the package, the SIGTAG_MD5 tag holds a 128-bit MD5 checksum of the header and payload sections. The SIGTAG_SHA1HEADER tag holds an SHA1 checksum of the entire header section.
To verify the authenticity of the package, the SIGTAG_PGP tag holds a Version 3 OpenPGP Signature Packet RSA signature of the header and payload areas. The SIGTAG_GPG tag holds a Version 3 OpenPGP Signature Packet DSA signature of the header and payload areas. The SIGTAG_DSAHEADER tag holds a DSA signature of just the header section. If the SIGTAG_DSAHEADER tag is included, the SIGTAG_GPG tag must also be present. The SIGTAG_RSAHEADER tag holds an RSA signature of just the header section. If the SIGTAG_RSAHEADER tag is included, the SIGTAG_PGP tag must also be present.
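The two integrity checks can be sketched with Python's hashlib module. This assumes the header and payload bytes have already been extracted from the file exactly as stored (the payload still compressed), that SIGTAG_MD5 is the raw 16-byte digest (a BIN value), and that SIGTAG_SHA1HEADER is a hexadecimal string. The function names are hypothetical and the sample bytes are synthetic.

```python
import hashlib

def md5_matches(header: bytes, payload: bytes, sigtag_md5: bytes) -> bool:
    """Compare the MD5 digest of header plus payload with SIGTAG_MD5."""
    return hashlib.md5(header + payload).digest() == sigtag_md5

def sha1_header_matches(header: bytes, sigtag_sha1header: str) -> bool:
    """Compare the SHA1 digest of the header with SIGTAG_SHA1HEADER."""
    return hashlib.sha1(header).hexdigest() == sigtag_sha1header.lower()

# Synthetic stand-ins for the header and payload sections.
header = b"header-bytes"
payload = b"payload-bytes"
stored_md5 = hashlib.md5(header + payload).digest()
stored_sha1 = hashlib.sha1(header).hexdigest()

print(md5_matches(header, payload, stored_md5))          # True
print(md5_matches(header, payload + b"x", stored_md5))   # False
print(sha1_header_matches(header, stored_sha1))          # True
```

These digests only detect accidental or naive corruption. Checking authenticity requires verifying the PGP or GPG signatures with the signer's public key, which is beyond this sketch.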

24.1.3.4. Installation Tags

A set of installation-specific tags tells the rpm program how to run the pre- and post-installation scripts. Table D-5 lists these tags.
Table D-5 Installation tags
| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_PREINPROG | 1085 | STRING | Optional |
| RPMTAG_POSTINPROG | 1086 | STRING | Optional |
| RPMTAG_PREUNPROG | 1087 | STRING | Optional |
| RPMTAG_POSTUNPROG | 1088 | STRING | Optional |
The RPMTAG_PREINPROG tag holds the name of the interpreter, such as sh, to run the pre-install script. Similarly, the RPMTAG_POSTINPROG tag holds the name of the interpreter to run the post-install script. RPMTAG_PREUNPROG and RPMTAG_POSTUNPROG are the same for the uninstall scripts.

24.1.3.5. File Information Tags

File information tags are placed in the header for convenient access. These tags describe the files in the payload. Table D-6 lists these tags.
Table D-6 File information tags
| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_OLDFILENAMES | 1027 | STRING_ARRAY | Optional |
| RPMTAG_FILESIZES | 1028 | INT32 | Yes |
| RPMTAG_FILEMODES | 1030 | INT16 | Yes |
| RPMTAG_FILERDEVS | 1033 | INT16 | Yes |
| RPMTAG_FILEMTIMES | 1034 | INT32 | Yes |
| RPMTAG_FILEMD5S | 1035 | STRING_ARRAY | Yes |
| RPMTAG_FILELINKTOS | 1036 | STRING_ARRAY | Yes |
| RPMTAG_FILEFLAGS | 1037 | INT32 | Yes |
| RPMTAG_FILEUSERNAME | 1039 | STRING_ARRAY | Yes |
| RPMTAG_FILEGROUPNAME | 1040 | STRING_ARRAY | Yes |
| RPMTAG_FILEDEVICES | 1095 | INT32 | Yes |
| RPMTAG_FILEINODES | 1096 | INT32 | Yes |
| RPMTAG_FILELANGS | 1097 | STRING_ARRAY | Yes |
| RPMTAG_DIRINDEXES | 1116 | INT32 | Optional |
| RPMTAG_BASENAMES | 1117 | STRING_ARRAY | Optional |
| RPMTAG_DIRNAMES | 1118 | STRING_ARRAY | Optional |
The RPMTAG_OLDFILENAMES tag is used when the file names are not compressed, that is, when the RPMTAG_REQUIRENAME tag does not list rpmlib(CompressedFileNames). The RPMTAG_FILESIZES tag specifies the size of each file in the payload, the RPMTAG_FILEMODES tag specifies the file modes (permissions), and the RPMTAG_FILEMTIMES tag holds the last modification time for each file.
The RPMTAG_BASENAMES tag holds an array of the base file names for the files in the payload. The RPMTAG_DIRNAMES tag holds an array of the directories for the files. The RPMTAG_DIRINDEXES tag contains an index into the RPMTAG_DIRNAMES for the directory. Each RPM must have either RPMTAG_OLDFILENAMES or the triple of RPMTAG_BASENAMES, RPMTAG_DIRNAMES, and RPMTAG_DIRINDEXES, but not both.
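The relationship between the three compressed file name tags can be sketched as follows. The function name `full_paths` is hypothetical and the sample values are invented. The sketch also assumes each directory name ends with a slash, so that simple concatenation yields a full path.

```python
def full_paths(basenames, dirnames, dirindexes):
    """Rebuild full file names from the three compressed file name arrays."""
    if len(basenames) != len(dirindexes):
        raise ValueError("basenames and dirindexes must have the same length")
    # dirindexes[i] selects the directory in dirnames for basenames[i].
    return [dirnames[index] + base for base, index in zip(basenames, dirindexes)]

# Invented sample data standing in for decoded header entries.
dirnames = ["/usr/bin/", "/etc/"]
basenames = ["demo", "demo-helper", "demo.conf"]
dirindexes = [0, 0, 1]

paths = full_paths(basenames, dirnames, dirindexes)
print(paths)  # ['/usr/bin/demo', '/usr/bin/demo-helper', '/etc/demo.conf']
```

Storing each directory once and referring to it by index is what makes the file names "compressed" compared with the flat list in RPMTAG_OLDFILENAMES.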

24.1.3.6. Dependency Tags

The dependency tags provide one of the most useful features of the RPM system by allowing for automated dependency checks between packages. Table D-7 lists these tags.
Table D-7 Dependency tags
| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_PROVIDENAME | 1047 | STRING_ARRAY | Yes |
| RPMTAG_REQUIREFLAGS | 1048 | INT32 | Yes |
| RPMTAG_REQUIRENAME | 1049 | STRING_ARRAY | Yes |
| RPMTAG_REQUIREVERSION | 1050 | STRING_ARRAY | Yes |
| RPMTAG_CONFLICTFLAGS | 1053 | INT32 | Optional |
| RPMTAG_CONFLICTNAME | 1054 | STRING_ARRAY | Optional |
| RPMTAG_CONFLICTVERSION | 1055 | STRING_ARRAY | Optional |
| RPMTAG_OBSOLETENAME | 1090 | STRING_ARRAY | Optional |
| RPMTAG_PROVIDEFLAGS | 1112 | INT32 | Yes |
| RPMTAG_PROVIDEVERSION | 1113 | STRING_ARRAY | Yes |
| RPMTAG_OBSOLETEFLAGS | 1114 | INT32 | Optional |
| RPMTAG_OBSOLETEVERSION | 1115 | STRING_ARRAY | Optional |
These tags come in triples, and each triple is formatted similarly. The RPMTAG_REQUIRENAME tag holds an array of required capabilities. The RPMTAG_REQUIREVERSION tag holds an array of the versions of the required capabilities. The RPMTAG_REQUIREFLAGS tag ties the two together with a set of bit flags that specify whether the requirement is for a version less than the given number, equal to the given number, greater than or equal to the given number, and so on. Table D-8 lists these flags.
Table D-8 Bit flags for dependencies
| Flag | Value |
|---|---|
| RPMSENSE_LESS | 0x02 |
| RPMSENSE_GREATER | 0x04 |
| RPMSENSE_EQUAL | 0x08 |
| RPMSENSE_PREREQ | 0x40 |
| RPMSENSE_INTERP | 0x100 |
| RPMSENSE_SCRIPT_PRE | 0x200 |
| RPMSENSE_SCRIPT_POST | 0x400 |
| RPMSENSE_SCRIPT_PREUN | 0x800 |
| RPMSENSE_SCRIPT_POSTUN | 0x1000 |
The RPMTAG_PROVIDENAME, RPMTAG_PROVIDEVERSION, and RPMTAG_PROVIDEFLAGS tags work similarly for the capabilities this package provides. The RPMTAG_CONFLICTNAME, RPMTAG_CONFLICTVERSION, and RPMTAG_CONFLICTFLAGS tags specify the conflicts. The RPMTAG_OBSOLETENAME, RPMTAG_OBSOLETEVERSION, and RPMTAG_OBSOLETEFLAGS tags specify the obsoleted dependencies.
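The name, version, and flags arrays of a triple line up element by element, so they can be combined into readable dependency strings. This is a minimal sketch using the comparison bits from Table D-8; the helper names `comparison` and `format_dependency` are hypothetical, and the sample arrays are invented.

```python
# Comparison bits from Table D-8.
RPMSENSE_LESS = 0x02
RPMSENSE_GREATER = 0x04
RPMSENSE_EQUAL = 0x08

def comparison(flags: int) -> str:
    """Translate the comparison bits into an operator such as '>='."""
    op = ""
    if flags & RPMSENSE_LESS:
        op += "<"
    if flags & RPMSENSE_GREATER:
        op += ">"
    if flags & RPMSENSE_EQUAL:
        op += "="
    return op

def format_dependency(name: str, version: str, flags: int) -> str:
    """Render one entry of a name/version/flags triple."""
    op = comparison(flags)
    return f"{name} {op} {version}" if op and version else name

# Invented sample arrays standing in for decoded header entries.
names = ["glibc", "demo-libs", "/bin/sh"]
versions = ["2.3", "1.0-1", ""]
flags = [RPMSENSE_GREATER | RPMSENSE_EQUAL, RPMSENSE_EQUAL, 0x100]

deps = [format_dependency(n, v, f) for n, v, f in zip(names, versions, flags)]
print(deps)  # ['glibc >= 2.3', 'demo-libs = 1.0-1', '/bin/sh']
```

The same helper works for the provide, conflict, and obsolete triples, since they share the flag values.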
In addition, an RPM package can define some special requirements in the RPMTAG_REQUIRENAME and RPMTAG_REQUIREVERSION tags. Table D-9 lists these requirements.
Table D-9 Special package requirement names and versions
| Name | Version | Specifies |
|---|---|---|
| lsb | 1.3 | The package conforms to the Linux Standard Base RPM format. |
| rpmlib(VersionedDependencies) | 3.0.3-1 | The package holds dependencies or prerequisites that have versions associated with them. |
| rpmlib(PayloadFilesHavePrefix) | 4.0-1 | File names in the archive have a “.” prepended to the names. |
| rpmlib(CompressedFileNames) | 3.0.4-1 | The package uses the RPMTAG_DIRINDEXES, RPMTAG_DIRNAMES, and RPMTAG_BASENAMES tags for specifying file names. |
| /bin/sh | NA | Indicates a requirement for the Bourne shell to run the installation scripts. |

24.1.4. The payload

The payload, or archive, section contains the actual files used in the package. These are the files that the rpm command installs when you install the package. To save space, data in the archive section is compressed in GNU gzip format.
Once uncompressed, the data is in cpio format, which is how the rpm2cpio command can do its work. In cpio format, the payload is made up of records, one per file. Table D-10 lists the record structure.
Table D-10 cpio file record structure
| Element | Holds |
|---|---|
| cpio header | Information on the file, such as the file mode (permissions) |
| File name | NULL-terminated string |
| Padding | 0 to 3 bytes, as needed, to align the next element on a 4-byte boundary |
| File data | The contents of the file |
| Padding | 0 to 3 bytes, as needed, to align the next file record on a 4-byte boundary |
The information in the cpio header duplicates that of the RPM file-information header elements.
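The record structure in Table D-10 can be walked with a short loop. The sketch below builds a tiny gzip-compressed archive in memory and reads it back. It assumes the "new ASCII" cpio header (the magic string 070701 followed by thirteen 8-digit hexadecimal fields) and the TRAILER!!! end marker, both of which come from the cpio format documentation rather than from this chapter. The helper names `build_record` and `iter_records` are hypothetical.

```python
import gzip

# Assumed "new ASCII" cpio header: 6-byte magic plus 13 hexadecimal fields.
FIELDS = ("ino", "mode", "uid", "gid", "nlink", "mtime", "filesize",
          "devmajor", "devminor", "rdevmajor", "rdevminor", "namesize", "check")
MAGIC = b"070701"
HEADER_LEN = len(MAGIC) + 8 * len(FIELDS)   # 110 bytes

def pad4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3

def build_record(name, data, mode=0o100644):
    """Build one record: header, file name, padding, file data, padding."""
    values = dict.fromkeys(FIELDS, 0)
    values.update(mode=mode, nlink=1, filesize=len(data), namesize=len(name) + 1)
    out = MAGIC + b"".join(b"%08X" % values[f] for f in FIELDS)
    out += name.encode("ascii") + b"\0"
    out = out.ljust(pad4(len(out)), b"\0")   # align the file data
    out += data
    return out.ljust(pad4(len(out)), b"\0")  # align the next record

def iter_records(archive):
    """Yield (name, mode, data) for each file record before the trailer."""
    pos = 0
    while pos < len(archive):
        if archive[pos:pos + 6] != MAGIC:
            raise ValueError("unexpected cpio magic")
        raw = archive[pos + 6:pos + HEADER_LEN].decode("ascii")
        hdr = {f: int(raw[i * 8:(i + 1) * 8], 16) for i, f in enumerate(FIELDS)}
        name_start = pos + HEADER_LEN
        name = archive[name_start:name_start + hdr["namesize"] - 1].decode("ascii")
        data_start = pad4(name_start + hdr["namesize"])
        data = archive[data_start:data_start + hdr["filesize"]]
        pos = pad4(data_start + hdr["filesize"])
        if name == "TRAILER!!!":
            break
        yield name, hdr["mode"], data

# Synthetic payload: one file plus the end-of-archive trailer, gzip-compressed.
archive = build_record("./etc/demo.conf", b"key=value\n") + build_record("TRAILER!!!", b"")
payload = gzip.compress(archive)
records = list(iter_records(gzip.decompress(payload)))
print(records)  # [('./etc/demo.conf', 33188, b'key=value\n')]
```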