summaryrefslogtreecommitdiffstats
path: root/docs/manual/dso.xml
blob: aebb3fa0cc7d1085d4214e9618f57b5e7f5471b8 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>

<manualpage metafile="dso.xml.meta">

  <title>Dynamic Shared Object (DSO) Support</title>

  <summary>
    <p>The Apache HTTP Server is a modular program where the
    administrator can choose the functionality to include in the
    server by selecting a set of modules. The modules can be
    statically compiled into the <code>httpd</code> binary when the
    server is built. Alternatively, modules can be compiled as
    Dynamic Shared Objects (DSOs) that exist separately from the
    main <code>httpd</code> binary file. DSO modules may be
    compiled at the time the server is built, or they may be
    compiled and added at a later time using the Apache Extension
    Tool (<a href="programs/apxs.html">apxs</a>).</p>

    <p>This document describes how to use DSO modules as well as
    the theory behind their use.</p>
  </summary>


<section id="implementation"><title>Implementation</title>

<related>
<modulelist>
<module>mod_so</module>
</modulelist>
<directivelist>
<directive module="mod_so">LoadModule</directive>
</directivelist>
</related>

    <p>The DSO support for loading individual Apache modules is based
    on a module named <module>mod_so</module> which must be statically
    compiled into the Apache core. It is the only module besides
    <module>core</module> which cannot be put into a DSO
    itself. Practically all other distributed Apache modules can then
    be placed into a DSO by individually enabling the DSO build for
    them via <code>configure</code>'s
    <code>--enable-<em>module</em>=shared</code> option as discussed
    in the <a href="install.html">install documentation</a>. After a
    module is compiled into a DSO named <code>mod_foo.so</code> you
    can use <module>mod_so</module>'s <directive
    module="mod_so">LoadModule</directive> command in your
    <code>httpd.conf</code> file to load this module at server startup
    or restart.</p>

    <p>To simplify this creation of DSO files for Apache modules
    (especially for third-party modules) a new support program
    named <a href="programs/apxs.html">apxs</a> (<em>APache
    eXtenSion</em>) is available. It can be used to build DSO based
    modules <em>outside of</em> the Apache source tree. The idea is
    simple: When installing Apache the <code>configure</code>'s
    <code>make install</code> procedure installs the Apache C
    header files and puts the platform-dependent compiler and
    linker flags for building DSO files into the <code>apxs</code>
    program. This way the user can use <code>apxs</code> to compile
    his Apache module sources without the Apache distribution
    source tree and without having to fiddle with the
    platform-dependent compiler and linker flags for DSO
    support.</p>
</section>

<section id="usage"><title>Usage Summary</title>

    <p>To give you an overview of the DSO features of Apache 2.0,
    here is a short and concise summary:</p>

    <ol>
      <li>
        Build and install a <em>distributed</em> Apache module, say
        <code>mod_foo.c</code>, into its own DSO
        <code>mod_foo.so</code>: 

<example>
$ ./configure --prefix=/path/to/install --enable-foo=shared<br />
$ make install
</example>
      </li>

      <li>
        Build and install a <em>third-party</em> Apache module, say
        <code>mod_foo.c</code>, into its own DSO
        <code>mod_foo.so</code>: 

<example>
$ ./configure --add-module=module_type:/path/to/3rdparty/mod_foo.c --enable-foo=shared<br />
$ make install
</example>
      </li>

      <li>
        Configure Apache for <em>later installation</em> of shared
        modules: 

<example>
$ ./configure --enable-so<br />
$ make install
</example>
      </li>

      <li>
        Build and install a <em>third-party</em> Apache module, say
        <code>mod_foo.c</code>, into its own DSO
        <code>mod_foo.so</code> <em>outside of</em> the Apache
        source tree using <a href="programs/apxs.html">apxs</a>: 

<example>
$ cd /path/to/3rdparty<br />
$ apxs -c mod_foo.c<br />
$ apxs -i -a -n foo mod_foo.la
</example>
      </li>
    </ol>

    <p>In all cases, once the shared module is compiled, you must
    use a <directive module="mod_so">LoadModule</directive>
    directive in <code>httpd.conf</code> to tell Apache to activate
    the module.</p>
</section>

<section id="background"><title>Background</title>

    <p>On modern Unix derivatives there exists a nifty mechanism
    usually called dynamic linking/loading of <em>Dynamic Shared
    Objects</em> (DSO) which provides a way to build a piece of
    program code in a special format for loading it at run-time
    into the address space of an executable program.</p>

    <p>This loading can usually be done in two ways: Automatically
    by a system program called <code>ld.so</code> when an
    executable program is started or manually from within the
    executing program via a programmatic system interface to the
    Unix loader through the system calls
    <code>dlopen()/dlsym()</code>.</p>

    <p>In the first way the DSO's are usually called <em>shared
    libraries</em> or <em>DSO libraries</em> and named
    <code>libfoo.so</code> or <code>libfoo.so.1.2</code>. They
    reside in a system directory (usually <code>/usr/lib</code>)
    and the link to the executable program is established at
    build-time by specifying <code>-lfoo</code> to the linker
    command. This hard-codes library references into the executable
    program file so that at start-time the Unix loader is able to
    locate <code>libfoo.so</code> in <code>/usr/lib</code>, in
    paths hard-coded via linker-options like <code>-R</code> or in
    paths configured via the environment variable
    <code>LD_LIBRARY_PATH</code>. It then resolves any (yet
    unresolved) symbols in the executable program which are
    available in the DSO.</p>

    <p>Symbols in the executable program are usually not referenced
    by the DSO (because it's a reusable library of general code)
    and hence no further resolving has to be done. The executable
    program has no need to do anything on its own to use the
    symbols from the DSO because the complete resolving is done by
    the Unix loader. (In fact, the code to invoke
    <code>ld.so</code> is part of the run-time startup code which
    is linked into every executable program which has been bound
    non-static). The advantage of dynamic loading of common library
    code is obvious: the library code needs to be stored only once,
    in a system library like <code>libc.so</code>, saving disk
    space for every program.</p>

    <p>In the second way the DSO's are usually called <em>shared
    objects</em> or <em>DSO files</em> and can be named with an
    arbitrary extension (although the canonical name is
    <code>foo.so</code>). These files usually stay inside a
    program-specific directory and there is no automatically
    established link to the executable program where they are used.
    Instead the executable program manually loads the DSO at
    run-time into its address space via <code>dlopen()</code>. At
    this time no resolving of symbols from the DSO for the
    executable program is done. But instead the Unix loader
    automatically resolves any (yet unresolved) symbols in the DSO
    from the set of symbols exported by the executable program and
    its already loaded DSO libraries (especially all symbols from
    the ubiquitous <code>libc.so</code>). This way the DSO gets
    knowledge of the executable program's symbol set as if it had
    been statically linked with it in the first place.</p>

    <p>Finally, to take advantage of the DSO's API the executable
    program has to resolve particular symbols from the DSO via
    <code>dlsym()</code> for later use inside dispatch tables
    <em>etc.</em> In other words: The executable program has to
    manually resolve every symbol it needs to be able to use it.
    The advantage of such a mechanism is that optional program
    parts need not be loaded (and thus do not spend memory) until
    they are needed by the program in question. When required,
    these program parts can be loaded dynamically to extend the
    base program's functionality.</p>

    <p>Although this DSO mechanism sounds straightforward there is
    at least one difficult step here: The resolving of symbols from
    the executable program for the DSO when using a DSO to extend a
    program (the second way). Why? Because "reverse resolving" DSO
    symbols from the executable program's symbol set is against the
    library design (where the library has no knowledge about the
    programs it is used by) and is neither available under all
    platforms nor standardized. In practice the executable
    program's global symbols are often not re-exported and thus not
    available for use in a DSO. Finding a way to force the linker
    to export all global symbols is the main problem one has to
    solve when using DSO for extending a program at run-time.</p>

    <p>The shared library approach is the typical one, because it
    is what the DSO mechanism was designed for, hence it is used
    for nearly all types of libraries the operating system
    provides. On the other hand using shared objects for extending
    a program is not used by a lot of programs.</p>

    <p>As of 1998 there are only a few software packages available
    which use the DSO mechanism to actually extend their
    functionality at run-time: Perl 5 (via its XS mechanism and the
    DynaLoader module), Netscape Server, <em>etc.</em> Starting
    with version 1.3, Apache joined the crew, because Apache
    already uses a module concept to extend its functionality and
    internally uses a dispatch-list-based approach to link external
    modules into the Apache core functionality. So, Apache is
    really predestined for using DSO to load its modules at
    run-time.</p>
</section>

<section id="advantages"><title>Advantages and Disadvantages</title>

    <p>The above DSO based features have the following
    advantages:</p>

    <ul>
      <li>The server package is more flexible at run-time because
      the actual server process can be assembled at run-time via
      <directive module="mod_so">LoadModule</directive>
      <code>httpd.conf</code> configuration commands instead of
      <code>configure</code> options at build-time. For instance
      this way one is able to run different server instances
      (standard &amp; SSL version, minimalistic &amp; powered up
      version [mod_perl, PHP3], <em>etc.</em>) with only one Apache
      installation.</li>

      <li>The server package can be easily extended with
      third-party modules even after installation. This is at least
      a great benefit for vendor package maintainers who can create
      a Apache core package and additional packages containing
      extensions like PHP3, mod_perl, mod_fastcgi,
      <em>etc.</em></li>

      <li>Easier Apache module prototyping because with the
      DSO/<code>apxs</code> pair you can both work outside the
      Apache source tree and only need an <code>apxs -i</code>
      command followed by an <code>apachectl restart</code> to
      bring a new version of your currently developed module into
      the running Apache server.</li>
    </ul>

    <p>DSO has the following disadvantages:</p>

    <ul>
      <li>The DSO mechanism cannot be used on every platform
      because not all operating systems support dynamic loading of
      code into the address space of a program.</li>

      <li>The server is approximately 20% slower at startup time
      because of the symbol resolving overhead the Unix loader now
      has to do.</li>

      <li>The server is approximately 5% slower at execution time
      under some platforms because position independent code (PIC)
      sometimes needs complicated assembler tricks for relative
      addressing which are not necessarily as fast as absolute
      addressing.</li>

      <li>Because DSO modules cannot be linked against other
      DSO-based libraries (<code>ld -lfoo</code>) on all platforms
      (for instance a.out-based platforms usually don't provide
      this functionality while ELF-based platforms do) you cannot
      use the DSO mechanism for all types of modules. Or in other
      words, modules compiled as DSO files are restricted to only
      use symbols from the Apache core, from the C library
      (<code>libc</code>) and all other dynamic or static libraries
      used by the Apache core, or from static library archives
      (<code>libfoo.a</code>) containing position independent code.
      The only chances to use other code is to either make sure the
      Apache core itself already contains a reference to it or
      loading the code yourself via <code>dlopen()</code>.</li>
    </ul>

</section>

</manualpage>