[MPTCP][PATCH v7 mptcp-next 0/7] ADD_ADDR: ports support
by Geliang Tang
v7:
- use the MPTCP listening socket instead of TCP one
- release subflow_req->msk in subflow_init_req
- add mismatched port MIBs
- use sock_common in source_address
v6:
- create and bind the listening socket in mptcp_nl_cmd_add_addr.
- drop the patch "mptcp: add port number listened in kernel check" in
v5.
v5:
- use the per netns listening socket.
- First 8 patches in v4 had been merged to the export branch, drop them
from this patchset.
v4:
- hold msk->pm.lock in mptcp_pm_sport_in_anno_list.
- Merge the patchset 'Squash to "ADD_ADDR: ports support v3"' into v4.
v3:
- add two new patches, 8 and 11
- add more IS_ENABLED(CONFIG_MPTCP_IPV6) in patch 2
- define TCPOLEN_MPTCP_ADD_ADDR_HMAC in patch 4
- add flags check in patch 10
- update the testcases
v2:
- change mptcp_out_options's port field in CPU bype order.
- keep mptcp_options_received's port field in CPU bype order.
- add two new patches to simplify ADD_ADDR suboption writing.
- update mptcp_add_addr_len helper use adding up size.
- add more commit messages.
v1:
This series is the first version of ADD_ADDR ports support. I have solved
the listener problem which I mentioned at the meeting on 15th of October
by adding a new listening socket from the userspace (see patch 8). Up to
now this patchset works well.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/54
Geliang Tang (7):
mptcp: create the listening socket for new port
mptcp: add port number check for MP_JOIN
mptcp: add port number announced check
mptcp: deal with MPTCP_PM_ADDR_ATTR_PORT in PM netlink
selftests: mptcp: add port argument for pm_nl_ctl
mptcp: add the mibs for ADD_ADDR with port
selftests: mptcp: add testcases for ADD_ADDR with port
net/mptcp/mib.c | 6 +
net/mptcp/mib.h | 6 +
net/mptcp/options.c | 4 +
net/mptcp/pm_netlink.c | 97 ++++++++++++
net/mptcp/protocol.c | 2 +-
net/mptcp/protocol.h | 4 +
net/mptcp/subflow.c | 48 +++++-
.../testing/selftests/net/mptcp/mptcp_join.sh | 148 +++++++++++++++++-
tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 24 ++-
9 files changed, 333 insertions(+), 6 deletions(-)
--
2.26.2
1 month, 1 week
[MPTCP][PATCH v6 mptcp-next 0/9] ADD_ADDR: ports support
by Geliang Tang
v6:
- create and bind the listening socket in mptcp_nl_cmd_add_addr.
- drop the patch "mptcp: add port number listened in kernel check" in
v5.
v5:
- use the per netns listening socket.
- First 8 patches in v4 had been merged to the export branch, drop them
from this patchset.
v4:
- hold msk->pm.lock in mptcp_pm_sport_in_anno_list.
- Merge the patchset 'Squash to "ADD_ADDR: ports support v3"' into v4.
v3:
- add two new patches, 8 and 11
- add more IS_ENABLED(CONFIG_MPTCP_IPV6) in patch 2
- define TCPOLEN_MPTCP_ADD_ADDR_HMAC in patch 4
- add flags check in patch 10
- update the testcases
v2:
- change mptcp_out_options's port field in CPU bype order.
- keep mptcp_options_received's port field in CPU bype order.
- add two new patches to simplify ADD_ADDR suboption writing.
- update mptcp_add_addr_len helper use adding up size.
- add more commit messages.
v1:
This series is the first version of ADD_ADDR ports support. I have solved
the listener problem which I mentioned at the meeting on 15th of October
by adding a new listening socket from the userspace (see patch 8). Up to
now this patchset works well.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/54
Geliang Tang (9):
mptcp: create the listening socket for new port
mptcp: set the listening socket's subflow
mptcp: release the listening socket
mptcp: add port number check for MP_JOIN
mptcp: add port number announced check
mptcp: deal with MPTCP_PM_ADDR_ATTR_PORT in PM netlink
selftests: mptcp: add port argument for pm_nl_ctl
mptcp: add the mibs for ADD_ADDR with port
selftests: mptcp: add testcases for ADD_ADDR with port
net/mptcp/mib.c | 4 +
net/mptcp/mib.h | 4 +
net/mptcp/options.c | 4 +
net/mptcp/pm_netlink.c | 91 +++++++++++++-
net/mptcp/protocol.h | 5 +
net/mptcp/subflow.c | 80 +++++++++++-
.../testing/selftests/net/mptcp/mptcp_join.sh | 114 +++++++++++++++++-
tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 24 +++-
8 files changed, 315 insertions(+), 11 deletions(-)
--
2.26.2
1 month, 2 weeks
[MPTCP][PATCH v2 mptcp-next 0/2] remove address when netlink flush addrs and testcase
by Geliang Tang
v2:
- update the testcase.
This patchset removes address when netlink do flush addrs command, and
adds the testcase for flush addrs command.
Geliang Tang (2):
mptcp: remove address when netlink flush addrs
selftests: mptcp: add the flush addrs testcase
net/mptcp/pm_netlink.c | 15 ++++--
.../testing/selftests/net/mptcp/mptcp_join.sh | 50 +++++++++++++------
2 files changed, 46 insertions(+), 19 deletions(-)
--
2.26.2
1 month, 2 weeks
force existing service to use MPTCP
by Paolo Abeni
hello,
reviving this old topic.
I've experimented a bit with the LD_PRELOAD thing.
Looks like at least nginx and apache can be forced to use MPTCP instead
of TCP with a crafted unit file created automatically from the distro-
provided one.
e.g. for nginx, adding:
Conflicts=nginx.service
After=nginx.service
into the [unit] section, and:
Environment="LD_PRELOAD=/usr/lib64/use_mptcp.so"
ExecStartPre=sysctl -w net.mptcp.enabled=1
into the [Service] section.
Then I had to fight a bit with selinux. I did not really investigate
the issue, I think/fear selinux misunderstood mptcp sockets as raw
ones, so default policy fails. A bunch of:
ausearch -c 'nginx' --raw | audit2allow -M my-nginx
semodule -i my-nginx.pp
solved the problem.
Bottom line:
- the above looks tecnically viable [at least for some services]. I'm
looking for a more extended service/daemon list to investigate
fourther. I think we could/should really consider package the above in
mptcpd or the like.
- selinux (surprise, surprise!) can be a problem. Worth looking at it
(that is independent from the system we will pick to force MPTCP socket
usage)
Cheers,
Paolo
1 month, 3 weeks
[PATCH v3] mptcp: let MPTCP create max size skbs
by Paolo Abeni
Currently the xmit path of the MPTCP protocol creates smaller-
than-max-size skbs, which is suboptimal for the performances.
There are a few things to improve:
- when coalescing to an existing skb, must clear the PUSH flag
- tcp_build_frag() expect the available space as an argument.
When coalescing is enable MPTCP already subtracted the
to-be-coalesced skb len. We must increment said argument
accordingly.
Before:
./use_mptcp.sh netperf -H 127.0.0.1 -t TCP_STREAM
[...]
131072 16384 16384 30.00 24414.86
After:
./use_mptcp.sh netperf -H 127.0.0.1 -t TCP_STREAM
[...]
131072 16384 16384 30.05 28357.69
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
use_mptcp.sh forces exiting app to create MPTCP instead of TCP
ones via LD_PRELOAD of crafter socket() implementation.
https://github.com/pabeni/mptcp-tools/tree/master/use_mptcp
---
v2 -> v3:
- drop the tcp bits. They caused stream corruption which
could not be easily set, and dropping them does not affect
the performance in a visible way, since that code path is
hit only on corner cases
v1 -> v2:
- prevent splitting if from_ext is frozen: should never happen
but is cheap
- provide dummy mptcp_skb_split() for non MPTCP build
---
net/mptcp/protocol.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2a8174a7e630..82525d454c5e 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1256,6 +1256,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
struct mptcp_ext *mpext = NULL;
struct sk_buff *skb, *tail;
bool can_collapse = false;
+ int size_bias = 0;
int avail_size;
size_t ret = 0;
@@ -1277,10 +1278,12 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
mpext = skb_ext_find(skb, SKB_EXT_MPTCP);
can_collapse = (info->size_goal - skb->len > 0) &&
mptcp_skb_can_collapse_to(data_seq, skb, mpext);
- if (!can_collapse)
+ if (!can_collapse) {
TCP_SKB_CB(skb)->eor = 1;
- else
+ } else {
+ size_bias = skb->len;
avail_size = info->size_goal - skb->len;
+ }
}
/* Zero window and all data acked? Probe. */
@@ -1300,8 +1303,8 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
return 0;
ret = info->limit - info->sent;
- tail = tcp_build_frag(ssk, avail_size, info->flags, dfrag->page,
- dfrag->offset + info->sent, &ret);
+ tail = tcp_build_frag(ssk, avail_size + size_bias, info->flags,
+ dfrag->page, dfrag->offset + info->sent, &ret);
if (!tail) {
tcp_remove_empty_skb(sk, tcp_write_queue_tail(ssk));
return -ENOMEM;
@@ -1310,8 +1313,9 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
/* if the tail skb is still the cached one, collapsing really happened.
*/
if (skb == tail) {
- WARN_ON_ONCE(!can_collapse);
+ TCP_SKB_CB(tail)->tcp_flags &= ~TCPHDR_PSH;
mpext->data_len += ret;
+ WARN_ON_ONCE(!can_collapse);
WARN_ON_ONCE(zero_window_probe);
goto out;
}
--
2.26.2
1 month, 3 weeks
[MPTCP][PATCH mptcp-next 0/5] MP_PRIO support
by Geliang Tang
v1:
- add MP_PRIO PM netlink support
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/51
Geliang Tang (5):
mptcp: add the outgoing MP_PRIO support
mptcp: add the incoming MP_PRIO support
mptcp: deal with MPTCP_PM_ADDR_FLAG_BACKUP in PM netlink
mptcp: add the mibs for MP_PRIO
selftests: mptcp: add the MP_PRIO testcases
net/mptcp/mib.c | 2 +
net/mptcp/mib.h | 2 +
net/mptcp/options.c | 56 +++++++++++++
net/mptcp/pm.c | 19 +++++
net/mptcp/pm_netlink.c | 81 +++++++++++++++++++
net/mptcp/protocol.h | 11 +++
.../testing/selftests/net/mptcp/mptcp_join.sh | 81 ++++++++++++++++++-
7 files changed, 251 insertions(+), 1 deletion(-)
--
2.26.2
1 month, 3 weeks
[MPTCP][PATCH mptcp-next] mptcp: enable use_port when invoke addresses_equal
by Geliang Tang
When dealing with the addresses list local_addr_list or anno_list, we
should enables the function addresses_equal's parameter use_port.
Signed-off-by: Geliang Tang <geliangtang(a)gmail.com>
---
The patch should be inserted into the "ADD_ADDR: ports support" patchset
after the 3rd patch "mptcp: add port number announced check".
---
net/mptcp/pm_netlink.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 119d7abdc997..54c6b6359144 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -122,7 +122,7 @@ static bool lookup_subflow_by_saddr(const struct list_head *list,
skc = (struct sock_common *)mptcp_subflow_tcp_sock(subflow);
local_address(skc, &cur);
- if (addresses_equal(&cur, saddr, false))
+ if (addresses_equal(&cur, saddr, saddr->port))
return true;
}
@@ -195,7 +195,7 @@ lookup_anno_list_by_saddr(struct mptcp_sock *msk,
struct mptcp_pm_add_entry *entry;
list_for_each_entry(entry, &msk->pm.anno_list, list) {
- if (addresses_equal(&entry->addr, addr, false))
+ if (addresses_equal(&entry->addr, addr, true))
return entry;
}
@@ -604,7 +604,7 @@ int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct sock_common *skc)
rcu_read_lock();
list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) {
- if (addresses_equal(&entry->addr, &skc_local, false)) {
+ if (addresses_equal(&entry->addr, &skc_local, entry->addr.port)) {
ret = entry->addr.id;
break;
}
--
2.26.2
1 month, 3 weeks
[MPTCP][PATCH mptcp-next] Squash to "[MPTCP][PATCH v7 mptcp-next 3/7] mptcp: add port number announced check"
by Geliang Tang
Drop source_address helper.
Signed-off-by: Geliang Tang <geliangtang(a)gmail.com>
---
net/mptcp/pm_netlink.c | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 4f1b7f44c03b..119d7abdc997 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -88,8 +88,8 @@ static bool address_zero(const struct mptcp_addr_info *addr)
static void local_address(const struct sock_common *skc,
struct mptcp_addr_info *addr)
{
- addr->port = 0;
addr->family = skc->skc_family;
+ addr->port = htons(skc->skc_num);
if (addr->family == AF_INET)
addr->addr.s_addr = skc->skc_rcv_saddr;
#if IS_ENABLED(CONFIG_MPTCP_IPV6)
@@ -111,19 +111,6 @@ static void remote_address(const struct sock_common *skc,
#endif
}
-static void source_address(struct sock_common *skc,
- struct mptcp_addr_info *addr)
-{
- addr->family = skc->skc_family;
- addr->port = htons(skc->skc_num);
- if (addr->family == AF_INET)
- addr->addr.s_addr = skc->skc_rcv_saddr;
-#if IS_ENABLED(CONFIG_MPTCP_IPV6)
- else if (addr->family == AF_INET6)
- addr->addr6 = skc->skc_v6_rcv_saddr;
-#endif
-}
-
static bool lookup_subflow_by_saddr(const struct list_head *list,
struct mptcp_addr_info *saddr)
{
@@ -221,7 +208,7 @@ bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk)
struct mptcp_addr_info saddr;
bool ret = false;
- source_address((struct sock_common *)sk, &saddr);
+ local_address((struct sock_common *)sk, &saddr);
spin_lock_bh(&msk->pm.lock);
list_for_each_entry(entry, &msk->pm.anno_list, list) {
--
2.26.2
1 month, 3 weeks
[PATCH net-next 0/3] mptcp: reject invalid mp_join requests right away
by Florian Westphal
At the moment MPTCP can detect an invalid join request (invalid token,
max number of subflows reached, and so on) right away but cannot reject
the connection until the 3WHS has completed.
Instead the connection will complete and the subflow is reset afterwards.
To send the reset most information is already available, but we don't have
good spot where the reset could be sent:
1. The ->init_req callback is too early and also doesn't allow to return an
error that could be used to inform the TCP stack that the SYN should be
dropped.
2. The ->route_req callback lacks the skb needed to send a reset.
3. The ->send_synack callback is the best fit from the available hooks,
but its called after the request socket has been inserted into the queue
already. This means we'd have to remove it again right away.
From a technical point of view, the second hook would be best:
1. Its before insertion into listener queue.
2. If it returns NULL TCP will drop the packet for us.
Problem is that we'd have to pass the skb to the function just for MPTCP.
Paolo suggested to merge init_req and route_req callbacks instead:
This makes all info available to MPTCP -- a return value of NULL drops the
packet and MPTCP can send the reset if needed.
Because 'route_req' has a 'const struct sock *', this means either removal
of const qualifier, or a bit of code churn to pass 'const' in security land.
This does the latter; I did not find any spots that need write access to struct
sock.
To recap, the two alternatives are:
1. Solve it entirely in MPTCP: use the ->send_synack callback to
unlink the request socket from the listener & drop it.
2. Avoid 'security' churn by removing the const qualifier.
1 month, 3 weeks
[PATCH net-next v2] mptcp: be careful on MPTCP-level ack.
by Paolo Abeni
We can enter the main mptcp_recvmsg() loop even when
no subflows are connected. As note by Eric, that would
result in a divide by zero oops on ack generation.
Address the issue by checking the subflow status before
sending the ack.
Additionally protect mptcp_recvmsg() against invocation
with weird socket states.
v1 -> v2:
- removed unneeded inline keyword - Jakub
Reported-and-suggested-by: Eric Dumazet <eric.dumazet(a)gmail.com>
Fixes: ea4ca586b16f ("mptcp: refine MPTCP-level ack scheduling")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
net/mptcp/protocol.c | 67 ++++++++++++++++++++++++++++++++------------
1 file changed, 49 insertions(+), 18 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 4b7794835fea..371a5e691a9a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -419,31 +419,57 @@ static bool mptcp_subflow_active(struct mptcp_subflow_context *subflow)
return ((1 << ssk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT));
}
-static void mptcp_send_ack(struct mptcp_sock *msk, bool force)
+static bool tcp_can_send_ack(const struct sock *ssk)
+{
+ return !((1 << inet_sk_state_load(ssk)) &
+ (TCPF_SYN_SENT | TCPF_SYN_RECV | TCPF_TIME_WAIT | TCPF_CLOSE));
+}
+
+static void mptcp_send_ack(struct mptcp_sock *msk)
{
struct mptcp_subflow_context *subflow;
- struct sock *pick = NULL;
mptcp_for_each_subflow(msk, subflow) {
struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
- if (force) {
- lock_sock(ssk);
+ lock_sock(ssk);
+ if (tcp_can_send_ack(ssk))
tcp_send_ack(ssk);
- release_sock(ssk);
- continue;
- }
-
- /* if the hintes ssk is still active, use it */
- pick = ssk;
- if (ssk == msk->ack_hint)
- break;
+ release_sock(ssk);
}
- if (!force && pick) {
- lock_sock(pick);
- tcp_cleanup_rbuf(pick, 1);
- release_sock(pick);
+}
+
+static bool mptcp_subflow_cleanup_rbuf(struct sock *ssk)
+{
+ int ret;
+
+ lock_sock(ssk);
+ ret = tcp_can_send_ack(ssk);
+ if (ret)
+ tcp_cleanup_rbuf(ssk, 1);
+ release_sock(ssk);
+ return ret;
+}
+
+static void mptcp_cleanup_rbuf(struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+
+ /* if the hinted ssk is still active, try to use it */
+ if (likely(msk->ack_hint)) {
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+ if (msk->ack_hint == ssk &&
+ mptcp_subflow_cleanup_rbuf(ssk))
+ return;
+ }
}
+
+ /* otherwise pick the first active subflow */
+ mptcp_for_each_subflow(msk, subflow)
+ if (mptcp_subflow_cleanup_rbuf(mptcp_subflow_tcp_sock(subflow)))
+ return;
}
static bool mptcp_check_data_fin(struct sock *sk)
@@ -494,7 +520,7 @@ static bool mptcp_check_data_fin(struct sock *sk)
ret = true;
mptcp_set_timeout(sk, NULL);
- mptcp_send_ack(msk, true);
+ mptcp_send_ack(msk);
mptcp_close_wake_up(sk);
}
return ret;
@@ -1579,6 +1605,11 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
return -EOPNOTSUPP;
lock_sock(sk);
+ if (unlikely(sk->sk_state == TCP_LISTEN)) {
+ copied = -ENOTCONN;
+ goto out_err;
+ }
+
timeo = sock_rcvtimeo(sk, nonblock);
len = min_t(size_t, len, INT_MAX);
@@ -1604,7 +1635,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
/* be sure to advertise window change */
old_space = READ_ONCE(msk->old_wspace);
if ((tcp_space(sk) - old_space) >= old_space)
- mptcp_send_ack(msk, false);
+ mptcp_cleanup_rbuf(msk);
/* only the master socket status is relevant here. The exit
* conditions mirror closely tcp_recvmsg()
--
2.26.2
1 month, 3 weeks