<!doctype debiandoc PUBLIC "-//DebianDoc//DTD DebianDoc//EN">
<book>
<title>dpkg technical manual</title>
<author>Tom Lees <email>tom@lpsg.demon.co.uk</email></author>
<version>$Id: dpkg-tech.sgml,v 1.3 2003/02/12 15:05:45 doogie Exp $</version>
<abstract>
This document describes the minimum necessary workings for the APT dselect
replacement. It gives an overall specification of what its external interface
must look like for compatibility, and also gives details of some internal
quirks.
</abstract>
<copyright>
Copyright &copy; Tom Lees, 1997.
<p>
APT and this document are free software; you can redistribute them and/or
modify them under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 2 of the License, or (at your
option) any later version.
<p>
For more details, on Debian systems, see the file
/usr/share/common-licenses/GPL for the full license.
</copyright>
<toc sect>
<chapt>Quick summary of dpkg's external interface
<sect id="control">Control files
<p>
The basic dpkg package control file supports the following major features:-
<list>
<item>6 types of dependencies:-
<list>
<item>Pre-Depends, which must be satisfied before a package may be
unpacked
<item>Depends, which must be satisfied before a package may be
configured
<item>Recommends, to specify a package which if not installed may
severely limit the usefulness of the package
<item>Suggests, to specify a package which may increase the
productivity of the package
<item>Conflicts, to specify a package which must NOT be installed
in order for the package to be configured
<item>Breaks, to specify a package which is broken by the
package and which should therefore not be configured while broken
</list>
Each of these dependencies can specify a version and a dependency on that
version, for example "&lt;= 0.5-1", "= 2.7.2-1", etc. The comparators available
are:-
<list>
<item>"&lt;&lt;" - less than
<item>"&lt;=" - less than or equal to
<item>"&gt;&gt;" - greater than
<item>"&gt;=" - greater than or equal to
<item>"=" - equal to
</list>
<item>The concept of "virtual packages", which many other packages may provide,
using the Provides mechanism. An example of this is the "httpd" virtual package,
which all web servers should provide. Virtual package names may be used in
dependency headers. However, current policy is that virtual packages do not
support version numbers, so dependencies on virtual packages with versions
will always fail.
<item>Several other control fields, such as Package, Version, Description,
Section, Priority, etc., which are mainly for classification purposes. The
package name must consist entirely of lowercase letters and digits, plus the
characters '+', '-', and '.'. Fields can extend across multiple lines - on the
second and subsequent lines, there is a space at the beginning instead of a
field name and a ':'. Empty lines must consist of the text " .", which will be
ignored, as will the initial space for other continuation lines. This feature
is usually only used in the Description field.
</list>
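<p>
As a rough illustration of the multi-line field rules just described, here is
a minimal Python sketch that reads one control stanza into a dictionary. The
function name parse_stanza is my own invention and is not part of dpkg; the
sketch only covers the continuation-line and " ." rules given above, and does
no validation of field names or values.

```python
def parse_stanza(text):
    """Parse one control stanza into a dict of field name -> value.

    Hypothetical helper for illustration only; not dpkg's own parser.
    """
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are simply skipped in this sketch
        if line[0] in " \t":
            # Continuation line: belongs to the most recent field.
            if current is None:
                raise ValueError("continuation line before any field")
            body = line[1:]          # the initial space is dropped
            if body == ".":          # " ." stands for an empty line
                body = ""
            fields[current] += "\n" + body
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("missing ':' in line: %r" % line)
            current = name.strip()
            fields[current] = value.strip()
    return fields
```

<p>
A Description field with an embedded " ." line therefore comes back as a
single string containing an empty line at that point.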
<sect>The dpkg status area
<p>
The "dpkg status area" is the term used to refer to the directory where dpkg
keeps its various status files (GNU would have you call it the dpkg shared
state directory). This is always, on Debian systems, /var/lib/dpkg. However,
the default directory name should not be hard-coded, but #define'd, so that
alteration is possible (it is available via configure in dpkg 1.4.0.9 and
above). Of course, in a library, code should be allowed to override the
default directory, but the default should be part of the library (so that
the user may change the dpkg admin dir simply by replacing the library).
<p>
Dpkg keeps a variety of files in its status area. These are discussed later
on in this document, but a quick summary of the files is here:-
<list>
<item>available - this file contains a concatenation of control information
from all the packages which dpkg knows about. This is updated using the dpkg
commands "--update-avail &lt;file&gt;", "--merge-avail &lt;file&gt;", and
"--clear-avail".
<item>status - this file contains information on the following things for
every package:-
<list>
<item>Whether it is installed, not installed, unpacked, removed,
failed configuration, or half-installed (deconfigured in
favour of another package).
<item>Whether it is selected as install, hold, remove, or purge.
<item>If it is "ok" (no installation problems), or "not-ok".
<item>It usually also contains the section and priority (so that
dselect may classify packages not in available)
<item>For packages which did not initially appear in the "available"
file when they were installed, the other control information
for them.
</list>
<p>
The exact format for the "Status:" field is:
<example>
Status: Want Flag Status
</example>
Where <var>Want</> may be one of <em>unknown</>, <em>install</>,
<em>hold</>, <em>deinstall</>, <em>purge</>. <var>Flag</>
may be one of <em>ok</>, <em>reinstreq</>, <em>hold</>,
<em>hold-reinstreq</>.
<var>Status</> may be one of <em>not-installed</>, <em>unpacked</>,
<em>half-configured</>, <em>installed</>, <em>half-installed</>,
<em>config-files</>, <em>post-inst-failed</>, <em>removal-failed</>.
The states are as follows:-
<taglist>
<tag>not-installed
<item>No files are installed from the package, it has no config files
left, it uninstalled cleanly if it ever was installed.
<tag>unpacked
<item>The basic files have been unpacked (and are listed in
/var/lib/dpkg/info/[package].list). There are config files present,
but the postinst script has _NOT_ been run.
<tag>half-configured
<item>The package was installed and unpacked, but the postinst script
failed in some way.
<tag>installed
<item>All files for the package are installed, and the configuration
was also successful.
<tag>half-installed
<item>An attempt was made to remove the package, but there was a failure
in the prerm script.
<tag>config-files
<item>The package was "removed", not "purged". The config files are left,
but nothing else.
<tag>post-inst-failed
<item>Old name for half-configured. Do not use.
<tag>removal-failed
<item>Old name for half-installed. Do not use.
</taglist>
The two last items are only left in dpkg for compatibility - they are
understood by it, but never written out in this form.
<p>
Please see the dpkg source code, <tt>lib/parsehelp.c</tt>,
<em>statusinfos</>, <em>eflaginfos</> and <em>wantinfos</> for more
details.
<item>info - this directory contains files from the control archive of every
package currently installed. They are installed with a prefix of "&lt;packagename&gt;.".
In addition to this, it also contains a file called &lt;package&gt;.list for every
package, which contains a list of files. Note also that the control file is
not copied into here; it is instead found as part of status or available.
<item>methods - this directory is reserved for "method"-specific files - each
"method" has a subdirectory underneath this directory (or at least, it can
have). In addition, there is another subdirectory "mnt", where misc.
filesystems (floppies, CD-ROMs, etc.) are mounted.
<item>alternatives - directory used by the "update-alternatives" program. It
contains one file for each "alternatives" interface, which contains information
about all the needed symlinked files for each alternative.
<item>diversions - file used by the "dpkg-divert" program. Each diversion takes
three lines. The first is the package name (or ":" for user diversion), the
second the original filename, and the third the diverted filename.
<item>updates - directory used internally by dpkg. This is discussed later,
in the section <ref id="updates">.
<item>parts - temporary directory used by dpkg-split
</list>
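<p>
To make the "Status:" field format described above concrete, here is a small
Python sketch that splits a field value into its three words and checks each
against the lists given in this section. The names parse_status, WANTS, FLAGS,
STATES and OLD_NAMES are hypothetical; dpkg's own tables live in its C source.
The mapping of the two old state names onto their replacements follows the
taglist above.

```python
# Allowed words, copied from the lists in this section.
WANTS = {"unknown", "install", "hold", "deinstall", "purge"}
FLAGS = {"ok", "reinstreq", "hold", "hold-reinstreq"}
STATES = {"not-installed", "unpacked", "half-configured",
          "installed", "half-installed", "config-files"}
# Old state names that are still understood but never written out.
OLD_NAMES = {"post-inst-failed": "half-configured",
             "removal-failed": "half-installed"}


def parse_status(value):
    """Split a Status field value into (want, flag, state).

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    parts = value.split()
    if len(parts) != 3:
        raise ValueError("expected three words, got %r" % value)
    want, flag, state = parts
    state = OLD_NAMES.get(state, state)  # normalise the old names
    if want not in WANTS:
        raise ValueError("unknown want value %r" % want)
    if flag not in FLAGS:
        raise ValueError("unknown flag value %r" % flag)
    if state not in STATES:
        raise ValueError("unknown state value %r" % state)
    return want, flag, state
```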
<sect>The dpkg library files
<p>
These files are installed under /usr/lib/dpkg (usually), but
/usr/local/lib/dpkg is also a possibility (as Debian policy dictates). Under
this directory, there is a "methods" subdirectory. The methods subdirectory
in turn contains any number of subdirectories for each general method
processor (note that one set of method scripts can be, and is, used for more
than one of the methods listed under dselect).
<p>
The following files may be found in each of these subdirectories:-
<list>
<item>names - One line per method, two-digit priority to appear on menu
at beginning, followed by a space, the name, and then another space and the
short description.
<item>desc.&lt;name&gt; - Contains the long description displayed by dselect
when the cursor is put over the &lt;name&gt; method.
<item>setup - Script or program which sets up the initial values to be used
by this method. Called with first argument as the status area directory
(/var/lib/dpkg), second argument as the name of the method (as in the directory
name), and the third argument as the option (as in the names file).
<item>install - Script/program called when the "install" option of dselect is
run with this method. Same arguments as for setup.
<item>update - Script/program called when the "update" option of dselect is
run. Same arguments as for setup/install.
</list>
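<p>
The layout of a line in the "names" file, as described in the list above, can
be sketched in Python as follows. The function name parse_names_line is
hypothetical, and the sample line in the usage note is made up for
illustration rather than taken from a real method.

```python
def parse_names_line(line):
    """Split one line of a method 'names' file.

    Returns (priority, name, short_description). Hypothetical helper;
    assumes exactly the layout described above: a two-digit priority,
    a space, the name, another space, then the short description.
    """
    # Split on the first two spaces only, so the description keeps its spaces.
    priority, name, description = line.rstrip("\n").split(" ", 2)
    if len(priority) != 2 or not priority.isdigit():
        raise ValueError("priority must be two digits: %r" % priority)
    return int(priority), name, description
```

<p>
For example, a made-up line "35 disk Install from a mounted partition" would
give the priority 35, the name "disk", and the rest as the description.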
<sect>The "dpkg" command-line utility
<sect1>"Documented" command-line interfaces
<p>
As yet unwritten. You can refer to the other manuals for now. See
<manref name="dpkg" section="8">.
<sect1>Environment variables which dpkg responds to
<p>
<list>
<item>DPKG_NO_TSTP - if set to a non-null value, this variable causes dpkg to
run a child shell process instead of sending itself a SIGTSTP, when the user
selects to background the dpkg process when it asks about conffiles.
<item>SHELL - used to determine which shell to run in the case when
DPKG_NO_TSTP is set.
<item>CC - used as the C compiler to call to determine the target architecture.
The default is "gcc".
<item>PATH - dpkg checks that it can find at least the following files in the
path when it wants to run package installation scripts, and gives an error if
it cannot find all of them:-
<list>
<item>ldconfig
<item>start-stop-daemon
<item>install-info
<item>update-rc.d
</list>
</list>
<sect1>Assertions
<p>
The dpkg utility itself is required for quite a number of packages, even if
they have been installed with a tool totally separate from dpkg. The reason for
this is that some packages, in their pre-installation scripts, check that your
version of dpkg supports certain features. This was broken from the start, and
it should have actually been a control file header "Dpkg-requires", or similar.
What happens is that the configuration scripts will abort or continue according
to the exit code of a call to dpkg, which will stop them from being wrongly
configured.
<p>
These special command-line options, which simply return true or false, are
all prefixed with "--assert-". Here is a list of them (without the prefix):-
<list>
<item>support-predepends - Returns success or failure according to whether
a version of dpkg which supports predepends properly (1.1.0 or above) is
installed, according to the database.
<item>working-epoch - Returns success or failure according to whether a version
of dpkg which supports epochs in versions properly (1.4.0.7 or above) is
installed, according to the database.
</list>
<p>
Both these options check the status database to see what version of the "dpkg"
package is installed, and check it against a known working version.
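<p>
A program which wants to make the same check as a maintainer script could
call dpkg and look only at the exit code, roughly as in the Python sketch
below. The function name dpkg_asserts and its "run" parameter are my own
invention; "run" exists only so that the call can be replaced in tests, and
defaults to the standard subprocess.call.

```python
import subprocess


def dpkg_asserts(feature, run=subprocess.call):
    """Return True if 'dpkg --assert-<feature>' exits with status 0.

    Hypothetical wrapper. 'feature' is the option name without the
    '--assert-' prefix, e.g. 'working-epoch'. If dpkg is not on the
    PATH, subprocess.call raises an exception instead of returning.
    """
    return run(["dpkg", "--assert-" + feature]) == 0
```

<p>
A caller would then abort or continue according to the value returned, in the
same way that the scripts described above act on the exit code.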
<sect1>--predep-package
<p>
This strange option is described as follows in the source code:
<example>
/* Print a single package which:
 * (a) is the target of one or more relevant predependencies.
 * (b) has itself no unsatisfied pre-dependencies.
 * If such a package is present output is the Packages file entry,
 * which can be massaged as appropriate.
 * Exit status:
 * 0 = a package printed, OK
 * 1 = no suitable package available
 * 2 = error
 */
</example>
<p>
On further inspection of the source code, it appears that what it does is
this:-
<list>
<item>Looks at the packages in the database which are selected as "install",
and are installed.
<item>It then looks at the Pre-Depends information for each of these packages
from the available file. When it finds a package for which any of the
pre-dependencies are not satisfied, it breaks from the loop through the packages.
<item>It then looks through the unsatisfied pre-dependencies, and looks for
packages which would satisfy this pre-dependency, stopping on the first it
finds. If it finds none, it bombs out with an error.
<item>It then continues this for every dependency of the initial package.
</list>
Eventually, it writes out the record of all the packages to satisfy the
pre-dependencies. This is used by the disk method to make sure that its
dependency ordering is correct. What happens is that all pre-depending
packages are first installed, then it runs dpkg -iGROEB on the directory,
which installs in the order package files are found. Since pre-dependencies
mean that a package may not even be unpacked unless they are satisfied, it is
necessary to do this (usually, since all the package files are unpacked in one
phase, then configured in another, this is not needed).
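<p>
The following Python sketch is one loose reading of the steps above, working
on toy in-memory data rather than the real database. Everything in it is an
assumption made for illustration: the function name predep_package, the shape
of its arguments, and the simplification that a pre-dependency is satisfied
when the named package is in the "installed" set. Versions and virtual
packages are ignored entirely, so this should not be taken as what dpkg
actually does.

```python
def predep_package(selected, available, installed):
    """Pick one package to install next, or None if nothing is needed.

    selected  -- names of packages selected for install, in order
    available -- dict: package name -> list of pre-dependency names
    installed -- set of package names counted as satisfying a pre-dependency
    Raises LookupError if a needed package is not available (the
    situation the source comment reports as exit status 2).
    """
    for pkg in selected:
        missing = [d for d in available.get(pkg, []) if d not in installed]
        if missing:
            break  # first package with an unsatisfied pre-dependency
    else:
        return None  # corresponds to exit status 1: nothing suitable

    target = missing[0]
    seen = set()
    while True:
        if target not in available:
            raise LookupError("no package satisfies %r" % target)
        if target in seen:
            raise LookupError("pre-dependency loop at %r" % target)
        seen.add(target)
        deeper = [d for d in available[target] if d not in installed]
        if not deeper:
            return target  # has no unsatisfied pre-dependencies itself
        target = deeper[0]
```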
<chapt>dpkg-deb and .deb file internals
<p>
This chapter describes the internals of the "dpkg-deb" tool, which is used
by "dpkg" as a back-end. dpkg-deb has its own tar extraction functions, which
is the source of many problems, as it does not support the extension blocks
used for long filenames.
<sect>The .deb archive format
<p>
The main principle of the new-format Debian archive (I won't describe the old
format - for that have a look at deb-old.5), is that the archive really is
an archive - as used by "ar" and friends. However, dpkg-deb uses this format
internally, rather than calling "ar". Inside this archive, there are usually
the following members:-
<list>
<item>debian-binary
<item>control.tar.gz
<item>data.tar.gz
</list>
<p>
The debian-binary member consists simply of the string "2.0", indicating the
format version. control.tar.gz contains the control files (and scripts), and
the data.tar.gz contains the actual files to populate the filesystem with.
Both tarfiles extract straight into the current directory. Information on the
tar formats can be found in the GNU tar info page. Since dpkg-deb calls
"tar -cf" to build packages, the Debian packages use the GNU extensions.
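<p>
Since the outer container is a plain "ar" archive, its members can be listed
without calling "ar" at all. The Python sketch below walks the common ar
layout (an 8-byte magic string, then a 60-byte header before each member, with
member data padded to an even length). The function name list_ar_members is
hypothetical, and the sketch assumes short member names only; it does not
handle the extended-name tables some ar implementations use.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60


def list_ar_members(data):
    """Return a list of (name, payload) pairs from an ar archive in memory.

    Hypothetical helper for illustration; 'data' is the whole file as bytes.
    """
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(data):
        header = data[pos:pos + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        # Name is the first 16 bytes, space padded; some tools append '/'.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        # Size is a decimal number in bytes 48..57.
        size = int(header[48:58].decode("ascii").strip())
        start = pos + HEADER_SIZE
        members.append((name, data[start:start + size]))
        # Member data is padded to an even number of bytes.
        pos = start + size + (size % 2)
    return members
```

<p>
Run over a new-format package, this should report the three members listed
above, and the payload of debian-binary should begin with "2.0".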
<sect>The dpkg-deb command-line
<p>
dpkg-deb documents itself thoroughly with its '--help' command-line option.
However, I am including a reference to these for completeness. dpkg-deb
supports the following options:-
<list>
<item>--build (-b) &lt;dir&gt; - builds a .deb archive, takes a directory which
contains all the files as an argument. Note that the directory
&lt;dir&gt;/DEBIAN will be packed separately into the control archive.
<item>--contents (-c) &lt;debfile&gt; - Lists the contents of the "data.tar.gz"
member.
<item>--control (-e) &lt;debfile&gt; - Extracts the control archive into a
directory called DEBIAN. Alternatively, with another argument, it will extract
it into a different directory.
<item>--info (-I) &lt;debfile&gt; - Prints the contents of the "control" file
in the control archive to stdout. Alternatively, giving it other arguments will
cause it to print the contents of those files instead.
<item>--field (-f) &lt;debfile&gt; &lt;field&gt; ... - Prints any number of
fields from the "control" file. Giving it extra arguments limits the fields it
prints to only those specified. With no command-line arguments other than a
filename, it is equivalent to -I and just the .deb filename.
<item>--extract (-x) &lt;debfile&gt; &lt;dir&gt; - Extracts the data archive
of a Debian package under the directory &lt;dir&gt;.
<item>--vextract (-X) &lt;debfile&gt; &lt;dir&gt; - Same as --extract, except
it is the equivalent of giving tar the '-v' option - it prints the filenames as
it extracts them.
<item>--fsys-tarfile &lt;debfile&gt; - This option outputs a gunzip'd version
of data.tar.gz to stdout.
<item>--new - sets the archive format to be used to the new Debian format
<item>--old - sets the archive format to be used to the old Debian format
<item>--debug - Tells dpkg-deb to produce debugging output
<item>--nocheck - Tells dpkg-deb not to check the sanity of the control file
<item>--help (-h) - Gives a help message
<item>--version - Shows the version number
<item>--licence/--license (UK/US spellings) - Shows a brief outline of the GPL
</list>
<sect1>Internal checks used by dpkg-deb when building packages
<p>
Here is a list of the internal checks used by dpkg-deb when building packages.
It is in the order they are done.
<list>
<item>First, the output Debian archive argument, if it is given, is checked
using stat. If it is a directory, an internal flag is set. This check is only
made if the archive name is specified explicitly on the command-line. If the
argument was not given, the default is the directory name, with ".deb"
appended.
<item>Next, the control file is checked, unless the --nocheck flag was
specified on the command-line. dpkg-deb will bomb out if the second argument
to --build was a directory, and --nocheck was specified. Note that dpkg-deb
will not be able to determine the name of the package in this case. In the
control file, the following things are checked:-
<list>
<item>The package name is checked to see if it contains any invalid
characters (see <ref id="control"> for this).
<item>The priority field is checked to see if it uses standard values,
and user-defined values are warned against. However, note that this
check is now redundant, since the control file no longer contains
the priority - the changes file now does this.
<item>The control file fields are then checked against the standard
list of fields which appear in control files, and any "user-defined"
fields are reported as warnings.
<item>dpkg-deb then checks that the control file contains a valid
version number.
</list>
<item>After this, in the case where a directory was specified to build the
.deb file in, the filename is created as "directory/pkg_ver.deb" or
"directory/pkg_ver_arch.deb", depending on whether the control file contains
an architecture field.
<item>Next, dpkg-deb checks for the &lt;dir&gt;/DEBIAN directory. It complains
if it doesn't exist, or if it has permissions &lt; 0755, or &gt; 0775.
<item>It then checks that all the files in this subdir are either symlinks
or plain files, and have permissions between 0555 and 0775.
<item>The conffiles file is then checked to see if the filenames are too
long. Warnings are produced for each that is. After this, it checks that
the package provides initial copies of each of these conffiles, and that
they are all plain files.
</list>
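<p>
Two of the checks above are simple enough to sketch in Python: building the
output filename when a directory was given, and testing the permissions of the
DEBIAN directory. The function names output_name and control_dir_mode_ok are
hypothetical, and the permission test is a literal reading of the "between
0755 and 0775" wording in this section, compared as plain numbers; the real
dpkg-deb source may test the bits differently.

```python
import stat


def output_name(directory, package, version, architecture=None):
    """Build 'directory/pkg_ver.deb' or 'directory/pkg_ver_arch.deb'.

    Hypothetical helper; the architecture part is added only when the
    control file supplied one.
    """
    parts = [package, version]
    if architecture:
        parts.append(architecture)
    return directory.rstrip("/") + "/" + "_".join(parts) + ".deb"


def control_dir_mode_ok(mode):
    """Return True if the permission bits lie between 0755 and 0775.

    'mode' is an st_mode value as returned by os.stat(); the file type
    bits are masked off before the numeric comparison.
    """
    perms = stat.S_IMODE(mode)
    return 0o755 <= perms <= 0o775
```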
<chapt>dpkg internals
<p>
This chapter describes the internals of dpkg itself. Although the low-level
formats are quite simple, what dpkg does in certain cases often does not
make sense.
<sect id="updates">Updates
<p>
This describes the /var/lib/dpkg/updates directory. The function of this
directory is somewhat strange, and seems only to be used internally. A function
called cleanupdates is called whenever the database is scanned. This function
in turn uses <manref name="scandir" section="3">, to sort the files in this
directory. Files whose names do not consist entirely of digits are discarded.
dpkg also causes a fatal error if any of the filenames are different lengths.
<p>
After having scanned the directory, dpkg in turn parses each file the same way
it parses the status file (they are sorted by the scandir to be in numerical
order). After having done this, it then writes the status information back
to the "status" file, and removes all the "updates" files.
<p>
These files are created internally by dpkg's "checkpoint" function, and are
cleaned up when dpkg exits cleanly.
<p>
Judging by the use of the updates directory I would call it a Journal. In order
to efficiently ensure the complete integrity of the status file dpkg will
"checkpoint" or journal all of its activities in the updates directory. By
merging the contents of the updates directory (in order!!) against the
original status file it can get the precise current state of the system,
even in the event of a system failure while dpkg is running.
<p>
The other option would be to sync-rewrite the status file after each
operation, which would kill performance.
<p>
It is very important that any program that uses the status file abort if
the updates directory is not empty! The user should be informed to run dpkg
manually (what options though??) to correct the situation.
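<p>
The filename rules for the updates directory, and the "abort if it is not
empty" advice, could be sketched in Python as below. The function name
pending_updates is my own; it mirrors only what this section says (ignore
names that are not all digits, treat differing name lengths as fatal, and
return the rest in numerical order), not dpkg's actual cleanupdates code.

```python
import os


def pending_updates(updates_dir):
    """Return the update file names in the order they should be applied.

    Hypothetical helper. Names that are not entirely digits are ignored;
    names of differing lengths raise RuntimeError, standing in for the
    fatal error described above.
    """
    names = [n for n in os.listdir(updates_dir)
             if n and all(c in "0123456789" for c in n)]
    if len({len(n) for n in names}) > 1:
        raise RuntimeError("update file names differ in length")
    return sorted(names, key=int)
```

<p>
A program that only reads the status file would call this first and refuse to
continue if the list returned is not empty.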
<sect>What happens when dpkg reads the database
<p>
First, the status file is read. This gives dpkg an initial idea of the packages
that are there. Next, the updates files are read in, overriding the status
file, and if necessary, the status file is re-written, and updates files are
removed. Finally, the available file is read. The available file is read
with flags which preclude dpkg from updating any status information
(installed version, etc.) from it, and which tell it to record that the
packages it reads this time are available, not installed.
<p>
More information on updates is given above.
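<p>
The reading order described in this section can be pictured with the following
Python sketch, which works on plain dictionaries instead of real files. The
function name load_database and the data shapes are assumptions made for
illustration. In particular, I assume that a record from an updates file
replaces the whole status record for that package, which is only my reading
of "overriding".

```python
def load_database(status, updates, available):
    """Combine parsed records in the order described above.

    status    -- dict: package name -> dict of fields from the status file
    updates   -- list of such dicts, one per updates file, in numerical order
    available -- dict: package name -> dict of fields from the available file
    Returns (installed_info, available_info) as two separate dicts, so that
    nothing read from 'available' can alter the status information.
    """
    installed_info = {name: dict(rec) for name, rec in status.items()}
    for update in updates:
        for name, rec in update.items():
            # Assumption: a later record overrides the earlier one wholesale.
            installed_info[name] = dict(rec)
    available_info = {name: dict(rec) for name, rec in available.items()}
    return installed_info, available_info
```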
<sect>How dpkg compares version numbers
<p>
Version numbers consist of three parts: the epoch, the upstream version, and
the Debian revision. Dpkg compares these parts in that order. If the epochs
are different, it returns immediately, and so on.
<p>
However, the important part is how it compares the versions which are
essentially stored as just strings. These are compared in two distinct parts:
those consisting of numerical characters (which are evaluated, and then
compared), and those consisting of other characters. When comparing
non-numerical parts, they are compared as the character values (ASCII), but
non-alphabetical characters are considered "greater than" alphabetical ones.
Also note that longer strings (after excluding differences where numerical
values are equal) are considered "greater than" shorter ones.
<p>
Here are a few examples of how these rules apply:-
<example>
15 > 10
0010 == 10
d.r > dsr
32.d.r == 0032.d.r
d.rnr < d.rnrn
</example>
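<p>
The comparison rules given in this section can be written out as a short
Python sketch. It follows the rules exactly as stated here and reproduces the
five examples above; it is not a copy of dpkg's C code, and later versions of
dpkg may apply additional rules not covered in this document. The names
compare_fragment, compare_versions and split_version are my own, and the
splitting of a version at the first ':' and the last '-' is the usual Debian
convention rather than something spelled out in this section.

```python
DIGITS = "0123456789"


def _weight(ch):
    # Letters sort by ASCII value; any other character sorts after all letters.
    if ch.isascii() and ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def compare_fragment(a, b):
    """Compare two version strings; return -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit run character by character.
        while ((i < len(a) and a[i] not in DIGITS) or
               (j < len(b) and b[j] not in DIGITS)):
            wa = _weight(a[i]) if i < len(a) and a[i] not in DIGITS else 0
            wb = _weight(b[j]) if j < len(b) and b[j] not in DIGITS else 0
            if wa != wb:
                return -1 if wa < wb else 1
            i += 1
            j += 1
        # Compare the digit run by numerical value (so 0010 equals 10).
        si, sj = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        na, nb = int(a[si:i] or "0"), int(b[sj:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def split_version(version):
    """Split into (epoch, upstream, revision); missing parts default."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Compare epoch, then upstream version, then Debian revision."""
    ea, ua, ra = split_version(a)
    eb, ub, rb = split_version(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return compare_fragment(ua, ub) or compare_fragment(ra, rb)
```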
</book>