<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<manualpage metafile="reverse_proxy.xml.meta">
<parentdocument href="./">How-To / Tutorials</parentdocument>

  <title>Reverse Proxy Guide</title>

  <summary>
    <p>In addition to being a "basic" web server, and providing static and
    dynamic content to end-users, Apache httpd (as well as most other web
    servers) can also act as a reverse proxy server, also known as a
    "gateway" server.</p>

    <p>In such scenarios, httpd itself does not generate or host the data.
    Instead, the content is obtained from one or several backend servers,
    which normally have no direct connection to the external network. When
    httpd receives a request from a client, the request is <em>proxied</em>
    to one of these backend servers, which handles the request, generates
    the content and sends it back to httpd, which in turn
    generates the actual HTTP response to the client.</p>

    <p>There are numerous reasons for such an implementation, but the most
    common are security, high availability, load balancing,
    and centralized authentication/authorization. It is critical in these
    implementations that the layout, design and architecture of the backend
    infrastructure (those servers which actually handle the requests) are
    insulated and protected from the outside; as far as the client is concerned,
    the reverse proxy server <em>is</em> the sole source of all content.</p>

    <p>A typical implementation is shown below:</p>
    <p class="centered"><img src="../images/reverse-proxy-arch.png" alt="reverse-proxy-arch" /></p>

  </summary>


  <section id="related">
  <title>Reverse Proxy</title>
  <related>
    <modulelist>
      <module>mod_proxy</module>
      <module>mod_proxy_balancer</module>
      <module>mod_proxy_hcheck</module>
    </modulelist>
    <directivelist>
      <directive module="mod_proxy">ProxyPass</directive>
      <directive module="mod_proxy">BalancerMember</directive>
    </directivelist>
  </related>
  </section>

  <section id="simple">
    <title>Simple reverse proxying</title>

    <p>
      The <directive module="mod_proxy">ProxyPass</directive>
      directive specifies the mapping of incoming requests to the backend
      server (or a cluster of servers known as a <code>Balancer</code>
      group). The simplest example proxies all requests (<code>"/"</code>)
      to a single backend:
    </p>

    <highlight language="config">
ProxyPass "/"  "http://www.example.com/"
    </highlight>

    <p>
      To ensure that any <code>Location:</code> headers generated from
      the backend are modified to point to the reverse proxy, instead of
      back to itself, the <directive module="mod_proxy">ProxyPassReverse</directive>
      directive is most often required:
    </p>

    <highlight language="config">
ProxyPass "/"  "http://www.example.com/"
ProxyPassReverse "/"  "http://www.example.com/"
    </highlight>

    <p>Only specific URIs can be proxied, as shown in this example:</p>

    <highlight language="config">
ProxyPass "/images/"  "http://www.example.com/"
ProxyPassReverse "/images/"  "http://www.example.com/"
    </highlight>

    <p>In the above, any request whose path starts with <code>/images</code>
      will be proxied to the specified backend; all other requests will be handled
      locally.
    </p>
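
    <p>As a further sketch, assuming a hypothetical path
      <code>/images/static/</code> that should be served by httpd itself
      rather than by the backend, a <code>!</code> target can exclude it from
      proxying. The exclusion has to appear before the more general mapping;
      the path name is only illustrative:</p>

    <highlight language="config">
# Hypothetical path: keep /images/static/ local instead of proxying it
ProxyPass "/images/static/"  "!"
# Everything else under /images/ goes to the backend
ProxyPass "/images/"  "http://www.example.com/"
ProxyPassReverse "/images/"  "http://www.example.com/"
    </highlight>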
  </section>

  <section id="cluster">
    <title>Clusters and Balancers</title>

    <p>
      As useful as the above is, it still has a deficiency: should the (single)
      backend node go down, or become heavily loaded, proxying those requests
      provides no real advantage. What is needed is the ability
      to define a set or group of backend servers which can handle such
      requests, and for the reverse proxy to load balance and fail over among
      them. This group is sometimes called a <em>cluster</em>, but Apache httpd's
      term is a <em>balancer</em>. One defines a balancer by leveraging the
      <directive module="mod_proxy" type="section">Proxy</directive> and
      <directive module="mod_proxy">BalancerMember</directive> directives as
      shown:
    </p>

    <highlight language="config">
&lt;Proxy balancer://myset&gt;
    BalancerMember http://www2.example.com:8080
    BalancerMember http://www3.example.com:8080
    ProxySet lbmethod=bytraffic
&lt;/Proxy&gt;

ProxyPass "/images/"  "balancer://myset/"
ProxyPassReverse "/images/"  "balancer://myset/"
    </highlight>

    <p>
      The <code>balancer://</code> scheme is what tells httpd that we are creating
      a balancer set, with the name <em>myset</em>. It includes 2 backend servers,
      which httpd calls <em>BalancerMembers</em>. In this case, any requests for
      <code>/images</code> will be proxied to <em>one</em> of the 2 backends.
      The <directive module="mod_proxy">ProxySet</directive> directive
      specifies that the <em>myset</em> Balancer use a load balancing algorithm
      that balances based on I/O bytes.
    </p>
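
    <p>
      The balancer configuration above depends on several modules being
      loaded. A minimal sketch, assuming a build with dynamically loaded
      modules in the default <code>modules/</code> directory (paths, and
      which modules are already loaded, vary between installations):
    </p>

    <highlight language="config">
# Core proxy support and the HTTP backend scheme
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
# Balancer support and the algorithm selected by lbmethod=bytraffic
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule lbmethod_bytraffic_module modules/mod_lbmethod_bytraffic.so
# Shared memory provider used by the balancer
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
    </highlight>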

    <note type="hint"><title>Hint</title>
      <p>
      	<em>BalancerMembers</em> are also sometimes referred to as <em>workers</em>.
      </p>
   </note>

  </section>

  <section id="config">
    <title>Balancer and BalancerMember configuration</title>

    <p>
      You can adjust numerous configuration details of the <em>balancers</em>
      and the <em>workers</em> via the various parameters defined in
      <directive module="mod_proxy">ProxyPass</directive>. For example,
      if we want <code>http://www3.example.com:8080</code> to
      handle three times the traffic, with a timeout of 1 second, we would adjust the
      configuration as follows:
    </p>

    <highlight language="config">
&lt;Proxy balancer://myset&gt;
    BalancerMember http://www2.example.com:8080
    BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1
    ProxySet lbmethod=bytraffic
&lt;/Proxy&gt;

ProxyPass "/images/"  "balancer://myset/"
ProxyPassReverse "/images/"  "balancer://myset/"
    </highlight>

  </section>

  <section id="failover">
    <title>Failover</title>

    <p>
      You can also fine-tune various failover scenarios, detailing which workers
      and even which balancers should be accessed in such cases. For example, the
      below setup implements three failover cases:
    </p>
    <ol>
      <li>
        <code>http://spare1.example.com:8080</code> and
        <code>http://spare2.example.com:8080</code> are only sent traffic if one
        or both of <code>http://www2.example.com:8080</code> or
        <code>http://www3.example.com:8080</code> is unavailable. (One spare
        will be used to replace one unusable member of the same balancer set.)
      </li>
      <li>
        <code>http://hstandby.example.com:8080</code> is only sent traffic if
        all other workers in balancer set <code>0</code> are not available.
      </li>
      <li>
        If all load balancer set <code>0</code> workers, spares, and the standby
        are unavailable, only then will the
        <code>http://bkup1.example.com:8080</code> and
        <code>http://bkup2.example.com:8080</code> workers from balancer set
        <code>1</code> be brought into rotation.
      </li>
    </ol>
    <p>
      Thus, it is possible to have one or more hot spares and hot standbys for
      each load balancer set.
    </p>

    <highlight language="config">
&lt;Proxy balancer://myset&gt;
    BalancerMember http://www2.example.com:8080
    BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1
    BalancerMember http://spare1.example.com:8080 status=+R
    BalancerMember http://spare2.example.com:8080 status=+R
    BalancerMember http://hstandby.example.com:8080 status=+H
    BalancerMember http://bkup1.example.com:8080 lbset=1
    BalancerMember http://bkup2.example.com:8080 lbset=1
    ProxySet lbmethod=byrequests
&lt;/Proxy&gt;

ProxyPass "/images/"  "balancer://myset/"
ProxyPassReverse "/images/"  "balancer://myset/"
    </highlight>

    <p>
      For failover, hot spares are used as replacements for unusable workers in
      the same load balancer set. A worker is considered unusable if it is
      draining, stopped, or otherwise in an error/failed state. Hot standbys are
      used if all workers and spares in the load balancer set are
      unavailable. Load balancer sets (with their respective hot spares and
      standbys) are always tried in order from lowest to highest.
    </p>

  </section>

  <section id="manager">
    <title>Balancer Manager</title>

    <p>
      One of the most useful features of Apache httpd's reverse proxy is
      the embedded <em>balancer-manager</em> application. Similar to
      <module>mod_status</module>, <em>balancer-manager</em> displays
      the current working configuration and status of the enabled
      balancers and workers currently in use. However, not only does it
      display these parameters, it also allows for dynamic, runtime
      reconfiguration of almost all of them, including adding new <em>BalancerMembers</em>
      (workers) to an existing balancer. To enable this capability, the following
      needs to be added to your configuration:
    </p>

    <highlight language="config">
&lt;Location "/balancer-manager"&gt;
    SetHandler balancer-manager
    Require host localhost
&lt;/Location&gt;
    </highlight>

    <note type="warning"><title>Warning</title>
      <p>Do not enable the <em>balancer-manager</em> until you have <a
      href="../mod/mod_proxy.html#access">secured your server</a>. In
      particular, ensure that access to the URL is tightly
      restricted.</p>
    </note>

    <p>
      When the reverse proxy server is accessed at that URL
      (e.g. <code>http://rproxy.example.com/balancer-manager/</code>), you will see a
      page similar to the one below:
    </p>
    <p class="centered"><img src="../images/bal-man.png" alt="balancer-manager page" /></p>

    <p>
      This form allows the administrator to adjust various parameters, take
      workers offline, change load balancing methods and add new workers. For
      example, clicking on the balancer itself displays the following page:
    </p>
    <p class="centered"><img src="../images/bal-man-b.png" alt="balancer-manager page" /></p>

    <p>
      Clicking on a worker, in contrast, displays this page:
    </p>
    <p class="centered"><img src="../images/bal-man-w.png" alt="balancer-manager page" /></p>

    <p>
      To have these changes persist across restarts of the reverse proxy, ensure that
      <directive module="mod_proxy">BalancerPersist</directive> is enabled.
    </p>
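
    <p>
      A minimal sketch of that setting, assuming it is placed in the server
      or virtual host configuration of the reverse proxy:
    </p>

    <highlight language="config">
# Keep changes made through the balancer-manager across restarts
BalancerPersist On
    </highlight>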

  </section>

  <section id="health-check">
    <title>Dynamic Health Checks</title>

    <p>
      Before httpd proxies a request to a worker, it can <em>"test"</em> whether that worker
      is available, provided the <code>ping</code> parameter is set for that worker using
      <directive module="mod_proxy">ProxyPass</directive>. It is often
      more useful to check the health of the workers <em>out of band</em>, in a
      dynamic fashion. This is achieved in Apache httpd by the
      <module>mod_proxy_hcheck</module> module.
    </p>
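
    <p>
      As an illustrative sketch, assuming <module>mod_proxy_hcheck</module>
      and <module>mod_watchdog</module> (which it relies on) are loaded,
      health checks can be attached to each worker through the
      <code>hc*</code> parameters. The expression name <code>ok234</code>,
      the <code>/health</code> URI and the 10 second interval are
      assumptions chosen for this example:
    </p>

    <highlight language="config">
# Named expression: a check passes when the response status is 2xx, 3xx or 4xx
ProxyHCExpr ok234 {%{REQUEST_STATUS} =~ /^[234]/}

&lt;Proxy balancer://myset&gt;
    # Probe each worker with GET /health every 10 seconds (hypothetical URI)
    BalancerMember http://www2.example.com:8080 hcmethod=GET hcuri=/health hcinterval=10 hcexpr=ok234
    BalancerMember http://www3.example.com:8080 hcmethod=GET hcuri=/health hcinterval=10 hcexpr=ok234
&lt;/Proxy&gt;
    </highlight>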

  </section>

  <section id="status">
    <title>BalancerMember status flags</title>

    <p>
      In the <em>balancer-manager</em> the current state, or <em>status</em>, of a worker
      is displayed and can be set/reset. The meanings of these statuses are as follows:
    </p>
      <table border="1">
      	<tr><th>Flag</th><th>String</th><th>Description</th></tr>
      	<tr><td>&nbsp;</td><td><em>Ok</em></td><td>Worker is available</td></tr>
      	<tr><td>&nbsp;</td><td><em>Init</em></td><td>Worker has been initialized</td></tr>
        <tr><td><code>D</code></td><td><em>Dis</em></td><td>Worker is disabled and will not accept any requests; will be
                    automatically retried.</td></tr>
        <tr><td><code>S</code></td><td><em>Stop</em></td><td>Worker is administratively stopped; will not accept requests
                    and will not be automatically retried</td></tr>
        <tr><td><code>I</code></td><td><em>Ign</em></td><td>Worker is in ignore-errors mode and will always be considered available.</td></tr>
        <tr><td><code>R</code></td><td><em>Spar</em></td><td>Worker is a hot spare. For each worker in a given lbset that is unusable
                    (draining, stopped, in error, etc.), a usable hot spare with the same lbset will be used in
                    its place. Hot spares can help ensure that a specific number of workers are always available
                    for use by a balancer.</td></tr>
        <tr><td><code>H</code></td><td><em>Stby</em></td><td>Worker is in hot-standby mode and will only be used if no other
                    viable workers or spares are available in the balancer set.</td></tr>
        <tr><td><code>E</code></td><td><em>Err</em></td><td>Worker is in an error state, usually due to failing pre-request check;
                    requests will not be proxied to this worker, but it will be retried depending on
                    the <code>retry</code> setting of the worker.</td></tr>
        <tr><td><code>N</code></td><td><em>Drn</em></td><td>Worker is in drain mode and will only accept existing sticky sessions
                    destined for itself and ignore all other requests.</td></tr>
        <tr><td><code>C</code></td><td><em>HcFl</em></td><td>Worker has failed dynamic health check and will not be used until it
                    passes subsequent health checks.</td></tr>
      </table>
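
    <p>
      The single-letter flags can also be used as a worker's initial state
      through the <code>status</code> parameter of
      <directive module="mod_proxy">BalancerMember</directive>, where a
      leading <code>+</code> sets a flag and a leading <code>-</code> clears
      it. A short sketch, reusing hostnames from the examples above purely
      for illustration:
    </p>

    <highlight language="config">
&lt;Proxy balancer://myset&gt;
    BalancerMember http://www2.example.com:8080
    # Start in the disabled state (flag D)
    BalancerMember http://www3.example.com:8080 status=+D
    # Hot standby (flag H): used only when no other worker is usable
    BalancerMember http://hstandby.example.com:8080 status=+H
&lt;/Proxy&gt;
    </highlight>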
  </section>

</manualpage>