| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lets use a proper structure to clearly document and implement
skb fast clones.
Then, we might experiment more easily alternative layouts.
This patch adds a new skb_fclone_busy() helper, used by tcp and xfrm,
to stop leaking of implementation details.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we have two different policies for orphan sockets
that repeatedly stall on zero window ACKs. If a socket gets
a zero window ACK when it is transmitting data, the RTO is
used to probe the window. The socket is aborted after roughly
tcp_orphan_retries() retries (as in tcp_write_timeout()).
But if the socket was idle when it received the zero window ACK,
and later wants to send more data, we use the probe timer to
probe the window. If the receiver always returns zero window ACKs,
icsk_probes keeps getting reset in tcp_ack() and the orphan socket
can stall forever until the system reaches the orphan limit (as
commented in tcp_probe_timer()). This opens up a simple attack
to create lots of hanging orphan sockets to burn the memory
and the CPU, as demonstrated in the recent netdev post "TCP
connection will hang in FIN_WAIT1 after closing if zero window is
advertised." http://www.spinics.net/lists/netdev/msg296539.html
This patch follows the design in RTO-based probe: we abort an orphan
socket stalling on zero window when the probe timer reaches both
the maximum backoff and the maximum RTO. For example, an 100ms RTT
connection will timeout after roughly 153 seconds (0.3 + 0.6 +
.... + 76.8) if the receiver keeps the window shut. If the orphan
socket passes this check, but the system already has too many orphans
(as in tcp_out_of_resources()), we still abort it but we'll also
send an RST packet as the connection may still be active.
In addition, we change TCP_USER_TIMEOUT to cover (life or dead)
sockets stalled on zero-window probes. This changes the semantics
of TCP_USER_TIMEOUT slightly because it previously only applies
when the socket has pending transmission.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
cipso_v4_cache_init is only called by __init cipso_v4_init
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
ip4_frags_ctl_register is only called by __init ipfrag_init
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
tcp_init_mem is only called by __init tcp_init.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
| |
The dsa_switch_suspend() and dsa_switch_resume() functions are only used
when PM_SLEEP is enabled, so they need #ifdef CONFIG_PM_SLEEP protection
to avoid a compiler warning.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Proper CHECKSUM_COMPLETE support needs to adjust skb->csum
when we remove one header. Its done using skb_gro_postpull_rcsum()
In the case of IPv4, we know that the adjustment is not really needed,
because the checksum over IPv4 header is 0. Lets add a comment to
ease code comprehension and avoid copy/paste errors.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 3243acd37fd9
("ieee802154: add __init to lowpan_frags_sysctl_register")
added __init to lowpan_frags_ns_sysctl_register instead of
lowpan_frags_sysctl_register
Suggested-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
No caller uses the return value, so make this function return void.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
lowpan_frags_sysctl_register is only called by __init lowpan_net_frag_init
(part of the lowpan module).
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
irlan_open is only called by __init irlan_init in same module.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eric reports build failure with
CONFIG_BRIDGE_NETFILTER=n
We insist to build br_nf_core.o unconditionally, but we must only do so
if br_netfilter was enabled, else it fails to build due to
functions being defined to empty stubs (and some structure members
being defined out).
Also, BRIDGE_NETFILTER=y|m makes no sense when BRIDGE=n.
Fixes: 34666d467 (netfilter: bridge: move br_netfilter out of the core)
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
After previous patches to simplify qstats the qstats can be
made per cpu with a packed union in Qdisc struct.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes the use of qstats->qlen variable from the classifiers
and makes it an explicit argument to gnet_stats_copy_queue().
The qlen represents the qdisc queue length and is packed into
the qstats at the last moment before passnig to user space. By
handling it explicitely we avoid, in the percpu stats case, having
to figure out which per_cpu variable to put it in.
It would probably be best to remove it from qstats completely
but qstats is a user space ABI and can't be broken. A future
patch could make an internal only qstats structure that would
avoid having to allocate an additional u32 variable on the
Qdisc struct. This would make the qstats struct 128bits instead
of 128+32.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
| |
This adds helpers to manipulate qstats logic and replaces locations
that touch the counters directly. This simplifies future patches
to push qstats onto per cpu counters.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to run qdisc's without locking statistics and estimators
need to be handled correctly.
To resolve bstats make the statistics per cpu. And because this is
only needed for qdiscs that are running without locks which is not
the case for most qdiscs in the near future only create percpu
stats when qdiscs set the TCQ_F_CPUSTATS flag.
Next because estimators use the bstats to calculate packets per
second and bytes per second the estimator code paths are updated
to use the per cpu statistics.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pablo Neira Ayuso says:
====================
pull request: netfilter/ipvs updates for net-next
The following patchset contains Netfilter/IPVS updates for net-next,
most relevantly they are:
1) Four patches to make the new nf_tables masquerading support
independent of the x_tables infrastructure. This also resolves a
compilation breakage if the masquerade target is disabled but the
nf_tables masq expression is enabled.
2) ipset updates via Jozsef Kadlecsik. This includes the addition of the
skbinfo extension that allows you to store packet metainformation in the
elements. This can be used to fetch and restore this to the packets through
the iptables SET target, patches from Anton Danilov.
3) Add the hash:mac set type to ipset, from Jozsef Kadlecsick.
4) Add simple weighted fail-over scheduler via Simon Horman. This provides
a fail-over IPVS scheduler (unlike existing load balancing schedulers).
Connections are directed to the appropriate server based solely on
highest weight value and server availability, patch from Kenny Mathis.
5) Support IPv6 real servers in IPv4 virtual-services and vice versa.
Simon Horman informs that the motivation for this is to allow more
flexibility in the choice of IP version offered by both virtual-servers
and real-servers as they no longer need to match: An IPv4 connection
from an end-user may be forwarded to a real-server using IPv6 and
vice versa. No ip_vs_sync support yet though. Patches from Alex Gartrell
and Julian Anastasov.
6) Add global generation ID to the nf_tables ruleset. When dumping from
several different object lists, we need a way to identify that an update
has ocurred so userspace knows that it needs to refresh its lists. This
also includes a new command to obtain the 32-bits generation ID. The
less significant 16-bits of this ID is also exposed through res_id field
in the nfnetlink header to quickly detect the interference and retry when
there is no risk of ID wraparound.
7) Move br_netfilter out of the bridge core. The br_netfilter code is
built in the bridge core by default. This causes problems of different
kind to people that don't want this: Jesper reported performance drop due
to the inconditional hook registration and I remember to have read complains
on netdev from people regarding the unexpected behaviour of our bridging
stack when br_netfilter is enabled (fragmentation handling, layer 3 and
upper inspection). People that still need this should easily undo the
damage by modprobing the new br_netfilter module.
8) Dump the set policy nf_tables that allows set parameterization. So
userspace can keep user-defined preferences when saving the ruleset.
From Arturo Borrero.
9) Use __seq_open_private() helper function to reduce boiler plate code
in x_tables, From Rob Jones.
10) Safer default behaviour in case that you forget to load the protocol
tracker. Daniel Borkmann and Florian Westphal detected that if your
ruleset is stateful, you allow traffic to at least one single SCTP port
and the SCTP protocol tracker is not loaded, then any SCTP traffic may
be pass through unfiltered. After this patch, the connection tracking
classifies SCTP/DCCP/UDPlite/GRE packets as invalid if your kernel has
been compiled with support for these modules.
====================
Trivially resolved conflict in include/linux/skbuff.h, Eric moved some
netfilter skbuff members around, and the netfilter tree adjusted the
ifdef guards for the bridging info pointer.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Given following iptables ruleset:
-P FORWARD DROP
-A FORWARD -m sctp --dport 9 -j ACCEPT
-A FORWARD -p tcp --dport 80 -j ACCEPT
-A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT
One would assume that this allows SCTP on port 9 and TCP on port 80.
Unfortunately, if the SCTP conntrack module is not loaded, this allows
*all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
which we think is a security issue.
This is because on the first SCTP packet on port 9, we create a dummy
"generic l4" conntrack entry without any port information (since
conntrack doesn't know how to extract this information).
All subsequent packets that are unknown will then be in established
state since they will fallback to proto_generic and will match the
'generic' entry.
Our originally proposed version [1] completely disabled generic protocol
tracking, but Jozsef suggests to not track protocols for which a more
suitable helper is available, hence we now mitigate the issue for in
tree known ct protocol helpers only, so that at least NAT and direction
information will still be preserved for others.
[1] http://www.spinics.net/lists/netfilter-devel/msg33430.html
Joint work with Daniel Borkmann.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
We want to know in which cases the user explicitly sets the policy
options. In that case, we also want to dump back the info.
Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Jesper reported that br_netfilter always registers the hooks since
this is part of the bridge core. This harms performance for people that
don't need this.
This patch modularizes br_netfilter so it can be rmmod'ed, thus,
the hooks can be unregistered. I think the bridge netfilter should have
been a separated module since the beginning, Patrick agreed on that.
Note that this is breaking compatibility for users that expect that
bridge netfilter is going to be available after explicitly 'modprobe
bridge' or via automatic load through brctl.
However, the damage can be easily undone by modprobing br_netfilter.
The bridge core also spots a message to provide a clue to people that
didn't notice that this has been deprecated.
On top of that, the plan is that nftables will not rely on this software
layer, but integrate the connection tracking into the bridge layer to
enable stateful filtering and NAT, which is was bridge netfilter users
seem to require.
This patch still keeps the fake_dst_ops in the bridge core, since this
is required by when the bridge port is initialized. So we can safely
modprobe/rmmod br_netfilter anytime.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Florian Westphal <fw@strlen.de>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Move nf_bridge_copy_header() as static inline in netfilter_bridge.h
header file. This patch prepares the modularization of the br_netfilter
code.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Reduce boilerplate code by using __seq_open_private() instead of seq_open()
in xt_match_open() and xt_target_open().
Signed-off-by: Rob Jones <rob.jones@codethink.co.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch exposes the ruleset generation ID in three ways:
1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset
generation ID. This ID is incremented in every commit and it
should be large enough to avoid wraparound problems.
2) The less significant 16-bits of the generation ID are exposed through
the nfgenmsg->res_id header field. This allows us to quickly catch
if the ruleset has change between two consecutive list dumps from
different object lists (in this specific case I think the risk of
wraparound is unlikely).
3) Userspace subscribers may receive notifications of new rule-set
generation after every commit. This also provides an alternative
way to monitor the generation ID. If the events are lost, the
userspace process hits a overrun error, so it knows that it is
working with a stale ruleset anyway.
Patrick spotted that rule-set transformations in userspace may take
quite some time. In that case, it annotates the 32-bits generation ID
before fetching the rule-set, then:
1) it compares it to what we obtain after the transformation to
make sure it is not working with a stale rule-set and no wraparound
has ocurred.
2) it subscribes to ruleset notifications, so it can watch for new
generation ID.
This is complementary to the NLM_F_DUMP_INTR approach, which allows
us to detect an interference in the middle one single list dumping.
There is no way to explicitly check that an interference has occurred
between two list dumps from the kernel, since it doesn't know how
many lists the userspace client is actually going to dump.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| | |
This allows us to access the original content of the batch from
the commit and the abort paths.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Simon Horman says:
====================
This pull requests makes the following changes:
* Add simple weighted fail-over scheduler.
- Unlike other IPVS schedulers this offers fail-over rather than load
balancing. Connections are directed to the appropriate server based
solely on highest weight value and server availability.
- Thanks to Kenny Mathis
* Support IPv6 real servers in IPv4 virtual-services and vice versa
- This feature is supported in conjunction with the tunnel (IPIP)
forwarding mechanism. That is, IPv4 may be forwarded in IPv6 and
vice versa.
- The motivation for this is to allow more flexibility in the
choice of IP version offered by both virtual-servers and
real-servers as they no longer need to match: An IPv4 connection from an
end-user may be forwarded to a real-server using IPv6 and vice versa.
- Further work need to be done to support this feature in conjunction
with connection synchronisation. For now such configurations are
not allowed.
- This change includes update to netlink protocol, adding a new
destination address family attribute. And the necessary changes
to plumb this information throughout IPVS.
- Thanks to Alex Gartrell and Julian Anastasov
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Remove the temporary consistency check and add a case statement to only
allow ipip mixed dests.
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use the new address family field cp->daf when printing
cp->daddr in logs or connection listing.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Needed to support svc->af != dest->af.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The LBLCR entries should use svc->af, not dest->af.
Needed to support svc->af != dest->af.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The LBLC entries should use svc->af, not dest->af.
Needed to support svc->af != dest->af.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Pull the common logic for preparing an skb to prepend the header into a
single function and then set fields such that they can be used in either
case (generalize tos and tclass to dscp, hop_limit and ttl to ttl, etc)
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The out_rt functions check to see if the mtu is large enough for the packet
and, if not, send icmp messages (TOOBIG or DEST_UNREACH) to the source and
bail out. We needed the ability to send ICMP from the out_rt_v6 function
and DEST_UNREACH from the out_rt function, so we just pulled it out into a
common function.
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Another step toward heterogeneous pools, this removes another piece of
functionality currently specific to each address family type.
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This logic is repeated in both out_rt functions so it was redundant.
Additionally, we'll need to be able to do checks to route v4 to v6 and vice
versa in order to deal with heterogeneous pools.
This patch also updates the callsites to add an additional parameter to the
out route functions.
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The synchronization protocol is not compatible with heterogeneous pools, so
we need to verify that we're not turning both on at the same time.
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The assumption that dest af is equal to service af is now unreliable, so we
must specify it manually so as not to copy just the first 4 bytes of a v6
address or doing an illegal read of 16 butes on a v6 address.
We "lie" in two places: for synchronization (which we will explicitly
disallow from happening when we have heterogeneous pools) and for black
hole addresses where there's no real dest.
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Part of a series of diffs to tease out destination family from virtual
family. This diff just adds a parameter to ip_vs_trash_get and then uses
it for comparison rather than svc->af.
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We need to remove the assumption that virtual address family is the same as
real address family in order to support heterogeneous services (that is,
services with v4 vips and v6 backends or the opposite).
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This is necessary to support heterogeneous pools. For example, if you have
an ipv6 addressed network, you'll want to be able to forward ipv4 traffic
into it.
This patch enforces that destination address family is the same as service
family, as none of the forwarding mechanisms support anything else.
For the old setsockopt mechanism, we simply set the dest address family to
AF_INET as we do with the service.
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add simple weighted IPVS failover support to the Linux kernel. All
other scheduling modules implement some form of load balancing, while
this offers a simple failover solution. Connections are directed to
the appropriate server based solely on highest weight value and server
availability. Tested functionality with keepalived.
Signed-off-by: Kenny Mathis <kmathis@chokepoint.net>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add skbinfo extension kernel support for the list set type.
Introduce the new revision of the list set type.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add skbinfo extension kernel support for the hash set types.
Inroduce the new revisions of all hash set types.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add skbinfo extension kernel support for the bitmap set types.
Inroduce the new revisions of bitmap_ip, bitmap_ipmac and bitmap_port set types.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Skbinfo extension provides mapping of metainformation with lookup in the ipset tables.
This patch defines the flags, the constants, the functions and the structures
for the data type independent support of the extension.
Note the firewall mark stores in the kernel structures as two 32bit values,
but transfered through netlink as one 64bit value.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Dan Carpenter reported the following static checker warning:
net/netfilter/ipset/ip_set_core.c:1414 call_ad()
error: 'nlh->nlmsg_len' from user is not capped properly
The payload size is limited now by the max size of size_t.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| |
| |
| |
| |
| |
| |
| | |
Users are starting to test nf_tables with no x_tables support. Therefore,
masquerading needs to be indenpendent of it from Kconfig.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now that we have masquerading support in nf_tables, the NAT chain can
be use with it, not only for SNAT/DNAT. So make this chain type
independent of it.
While at it, move it inside the scope of 'if NF_NAT_IPV*' to simplify
dependencies.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Suggested by Stephen. Also drop inline keyword and let compiler decide.
gcc 4.7.3 decides to no longer inline tcp_ecn_check_ce, so split it up.
The actual evaluation is not inlined anymore while the ECN_OK test is.
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|