[PATCH v3 0/4] iproute: mptcp support
by Paolo Abeni
This improves ip-mptcp help, addressing feedback from Mat.
Also includes ss man page update.
Davide Caratti (1):
ss: allow dumping MPTCP subflow information
Paolo Abeni (3):
uapi: update linux/mptcp.h
add support for mptcp netlink interface
man: mptcp man page
include/uapi/linux/mptcp.h | 89 ++++++++
ip/Makefile | 2 +-
ip/ip.c | 3 +-
ip/ip_common.h | 1 +
ip/ipmptcp.c | 436 +++++++++++++++++++++++++++++++++++++
man/man8/ip-mptcp.8 | 142 ++++++++++++
man/man8/ss.8 | 5 +
misc/ss.c | 62 ++++++
8 files changed, 738 insertions(+), 2 deletions(-)
create mode 100644 include/uapi/linux/mptcp.h
create mode 100644 ip/ipmptcp.c
create mode 100644 man/man8/ip-mptcp.8
--
2.21.1
2 years, 2 months
[PATCH RFC] net: mptcp: improve fallback after successful 3-way-handshake
by Davide Caratti
From: Davide Caratti <dcaratti(a)nst.lab.eng.brq.redhat.com>
A socket can receive data without a DSS option after the MP_CAPABLE exchange
in the three-way handshake. In this case, it must fall back to regular
TCP in accordance with RFC 8684 §3.7. In addition, fallback to TCP must
occur if the peer sends data with a DSS option whose Data-Level Length (DLL)
is zero (a so-called "infinite mapping"). In case a server needs to fall back
and msk->subflow is NULL, msk->first can be used for send/receive operations.
Signed-off-by: Davide Caratti <dcaratti(a)redhat.com>
---
net/mptcp/protocol.c | 20 +++++++++++++++++++-
net/mptcp/subflow.c | 18 ++++++++++++++----
2 files changed, 33 insertions(+), 5 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 457655ea875b..045aacb68e13 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -736,6 +736,12 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
pr_debug("fallback passthrough");
ret = sock_sendmsg(ssock, msg);
return ret >= 0 ? ret + copied : (copied ? copied : ret);
+ } else if (__mptcp_needs_tcp_fallback(msk)) {
+ ssk = msk->first;
+ release_sock((struct sock *)msk);
+ pr_debug("msk->first passthrough");
+ ret = ssk->sk_prot->sendmsg(ssk, msg, len);
+ return ret >= 0 ? ret + copied : (copied ? copied : ret);
}
mptcp_clean_una(sk);
@@ -899,6 +905,14 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
mptcp_subflow_ctx(ssock->sk));
copied = sock_recvmsg(ssock, msg, flags);
return copied;
+ } else if (__mptcp_needs_tcp_fallback(msk)) {
+use_mskfirst:
+ pr_debug("fallback-read using msk->first");
+ sk = msk->first;
+ release_sock((struct sock *)msk);
+ copied = sk->sk_prot->recvmsg(sk, msg, len, nonblock, flags,
+ addr_len);
+ return copied;
}
timeo = sock_rcvtimeo(sk, nonblock);
@@ -963,8 +977,12 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
pr_debug("block timeout %ld", timeo);
mptcp_wait_data(sk, &timeo);
- if (unlikely(__mptcp_tcp_fallback(msk)))
+ if (unlikely(__mptcp_tcp_fallback(msk))) {
+ ssock = __mptcp_tcp_fallback(msk);
+ if (!ssock)
+ goto use_mskfirst;
goto fallback;
+ }
}
if (skb_queue_empty(&sk->sk_receive_queue)) {
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 47f901b712f9..460288c1c8ff 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -460,7 +460,8 @@ enum mapping_status {
MAPPING_OK,
MAPPING_INVALID,
MAPPING_EMPTY,
- MAPPING_DATA_FIN
+ MAPPING_DATA_FIN,
+ MAPPING_INFINITE
};
static u64 expand_seq(u64 old_seq, u16 old_data_len, u64 seq)
@@ -540,8 +541,14 @@ static enum mapping_status get_mapping_status(struct sock *ssk)
return MAPPING_EMPTY;
}
- if (!subflow->map_valid)
+ if (!subflow->map_valid) {
+ if (!mpext && !subflow->local_id) {
+ pr_debug("no mpext and 0 local id");
+ skb_ext_del(skb, SKB_EXT_MPTCP);
+ return MAPPING_INFINITE;
+ }
return MAPPING_INVALID;
+ }
goto validate_seq;
}
@@ -552,9 +559,8 @@ static enum mapping_status get_mapping_status(struct sock *ssk)
data_len = mpext->data_len;
if (data_len == 0) {
- pr_err("Infinite mapping not handled");
MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPRX);
- return MAPPING_INVALID;
+ return subflow->local_id ? MAPPING_INVALID : MAPPING_INFINITE;
}
if (mpext->data_fin == 1) {
@@ -663,6 +669,10 @@ static bool subflow_check_data_avail(struct sock *ssk)
ssk->sk_err = EBADMSG;
goto fatal;
}
+ if (status == MAPPING_INFINITE) {
+ tcp_sk(ssk)->is_mptcp = 0;
+ return false;
+ }
if (status != MAPPING_OK)
return false;
--
2.25.2
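Editor's note: the fallback rule described in the commit message above can be sketched in isolation as follows. This is only an illustrative userspace model, not the kernel code: `struct dss_option` and `classify_segment()` are hypothetical names, and the sketch assumes (as the patch does) that a local id of 0 identifies the initial MP_CAPABLE subflow, the only one allowed to fall back.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel types in the patch. */
enum mapping_status {
	MAPPING_OK,
	MAPPING_INVALID,
	MAPPING_INFINITE
};

struct dss_option {
	uint16_t data_len;	/* Data-Level Length; 0 means "infinite mapping" */
};

/*
 * Classify a data segment received on a subflow.
 * dss is NULL when the segment carries no DSS option.
 * local_id is 0 for the initial (MP_CAPABLE) subflow.
 * map_valid is true while an earlier mapping still covers the data.
 */
static enum mapping_status classify_segment(const struct dss_option *dss,
					    uint8_t local_id, bool map_valid)
{
	if (!dss) {
		if (map_valid)
			return MAPPING_OK;	/* still inside an earlier mapping */
		/* No mapping at all: only the initial subflow may fall back. */
		return local_id == 0 ? MAPPING_INFINITE : MAPPING_INVALID;
	}
	if (dss->data_len == 0)		/* explicit infinite mapping */
		return local_id == 0 ? MAPPING_INFINITE : MAPPING_INVALID;
	return MAPPING_OK;
}
```

In this model a caller would treat MAPPING_INFINITE as the signal to stop MPTCP processing on the subflow and pass data through as plain TCP, while MAPPING_INVALID remains an error.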
[PATCH net] mptcp: fix data_fin handling in RX path
by Paolo Abeni
The data fin flag is set only via a DSS option, but currently
mptcp_incoming_options() copies it unconditionally from the
provided RX options.
Since the tcp sock RX options are not explicitly cleared on
socket free/alloc cycle, we can end up with a stray data_fin
value while parsing e.g. MPC packets.
That would lead to mapping data corruption and will trigger
a few WARN_ON() in the RX path.
Instead of adding a costly memset(), fetch the data_fin flag
only for DSS packets, where we always explicitly initialize
that bit at option parsing time.
Fixes: 648ef4b88673 ("mptcp: Implement MPTCP receive path")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
net/mptcp/options.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 3d541b60dcf3..129e9c214b2d 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -878,12 +878,11 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
mpext->data_seq = mp_opt->data_seq;
mpext->subflow_seq = mp_opt->subflow_seq;
mpext->dsn64 = mp_opt->dsn64;
+ mpext->data_fin = mp_opt->data_fin;
}
mpext->data_len = mp_opt->data_len;
mpext->use_map = 1;
}
-
- mpext->data_fin = mp_opt->data_fin;
}
void mptcp_write_options(__be32 *ptr, struct mptcp_out_options *opts)
--
2.21.1
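Editor's note: the idea behind the fix above, reading data_fin only when the packet actually carried a DSS option, can be modelled in a few lines. The structures and the function `copy_incoming()` below are hypothetical, trimmed-down stand-ins chosen for illustration; they are not the kernel definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down versions of the structures named in the patch. */
struct rx_options {
	bool dss;		/* set only when the packet carried a DSS option */
	bool use_map;
	uint8_t data_fin;	/* only meaningful when dss is set; may be stale */
	uint16_t data_len;
};

struct skb_ext {
	bool use_map;
	uint8_t data_fin;
	uint16_t data_len;
};

/* Copy per-packet state, reading data_fin only for DSS packets. */
static void copy_incoming(struct skb_ext *ext, const struct rx_options *opt)
{
	ext->use_map = false;
	ext->data_fin = 0;
	ext->data_len = 0;

	if (!opt->dss)
		return;		/* e.g. an MP_CAPABLE packet: ignore stale data_fin */

	if (opt->use_map) {
		ext->data_fin = opt->data_fin;
		ext->data_len = opt->data_len;
		ext->use_map = true;
	}
}
```

With this shape, a stale data_fin left over from a previous socket's options never reaches the per-packet extension, and no memset() of the whole options structure is needed.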
[PATCH -next] mptcp/pm_netlink.c : add check for nla_put_in6_addr
by Bo YU
Normally the return value of nla_put_in6_addr() should be checked,
as other callers in net do.
Detected by CoverityScan, CID# 1461639
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Signed-off-by: Bo YU <tsu.yubo(a)gmail.com>
---
BTW, I am not sure whether nla_put_in_addr() should be checked in the same way.
---
net/mptcp/pm_netlink.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 86d61ab34c7c..f340b00672e1 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -603,8 +603,9 @@ static int mptcp_nl_fill_addr(struct sk_buff *skb,
nla_put_in_addr(skb, MPTCP_PM_ADDR_ATTR_ADDR4,
addr->addr.s_addr);
#if IS_ENABLED(CONFIG_MPTCP_IPV6)
- else if (addr->family == AF_INET6)
- nla_put_in6_addr(skb, MPTCP_PM_ADDR_ATTR_ADDR6, &addr->addr6);
+ else if (addr->family == AF_INET6 &&
+ nla_put_in6_addr(skb, MPTCP_PM_ADDR_ATTR_ADDR6, &addr->addr6))
+ goto nla_put_failure;
#endif
nla_nest_end(skb, attr);
return 0;
--
2.11.0
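Editor's note: the pattern the patch applies (check every attribute "put" and bail out through a single failure label) can be illustrated outside the kernel. The sketch below uses a hypothetical fixed-size buffer, `struct msg_buf`, and hypothetical helpers `put_attr()` and `fill_addr()` in place of the skb and the nla_put_*() family; the rollback step plays the role that nla_nest_cancel() plays in kernel code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size message buffer standing in for an skb. */
struct msg_buf {
	unsigned char data[16];
	size_t len;
};

/* Append an attribute; returns 0 on success, -1 if it does not fit. */
static int put_attr(struct msg_buf *m, const void *val, size_t n)
{
	if (n > sizeof(m->data) - m->len)
		return -1;
	memcpy(m->data + m->len, val, n);
	m->len += n;
	return 0;
}

/* Fill one address entry, rolling back on the first failed put. */
static int fill_addr(struct msg_buf *m, const void *addr, size_t addr_len,
		     unsigned char id)
{
	size_t start = m->len;	/* remember where this entry begins */

	if (put_attr(m, &id, sizeof(id)))
		goto put_failure;
	if (put_attr(m, addr, addr_len))
		goto put_failure;
	return 0;

put_failure:
	m->len = start;		/* drop the partial entry */
	return -1;
}
```

If any put is left unchecked, a full buffer silently produces a truncated entry; checking each call keeps the message either complete or untouched.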
clearing mptcp RX options
by Paolo Abeni
Hi all,
currently we hook mptcp rx option clearing inside tcp_clear_options().
It looks like it's not enough, as tcp_clear_options() is not called
before each tcp_parse_options().
Overall effect I can observe is that mptcp_incoming_options() sees
opt_rx->mptcp.mp_capable == 1, while the current packet does not carry
any MP_CAPABLE option.
Am I missing anything important here?!?
A trivial fix would be moving:
#if IS_ENABLED(CONFIG_MPTCP)
opt_rx->mptcp.mp_capable = 0;
opt_rx->mptcp.mp_join = 0;
opt_rx->mptcp.add_addr = 0;
opt_rx->mptcp.rm_addr = 0;
opt_rx->mptcp.dss = 0;
#endif
from tcp_clear_options() to tcp_parse_options(), but I am wary of
touching the TCP code again. Perhaps we should consider aliasing/unioning
the above fields with a single 'sub_opts' byte so we can clear them all at
once?
Additionally, it now looks to me like the TCP code stores very
little per-packet information in 'opt_rx'. Should we try to move most
MPTCP per-packet information out of mptcp_options_received, too?
Thanks,
Paolo
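Editor's note: the 'sub_opts' aliasing idea floated above could look roughly like the sketch below. The structure name `mptcp_rx_flags` and the helper `mptcp_rx_flags_clear()` are hypothetical, the real mptcp_options_received has many more members, and the sketch assumes a compiler (such as GCC or Clang) that accepts C11 anonymous members and packs the five one-bit fields into a single byte.

```c
#include <stdint.h>

/* Hypothetical layout; the real mptcp_options_received has more members. */
struct mptcp_rx_flags {
	union {
		struct {
			uint8_t mp_capable : 1;
			uint8_t mp_join    : 1;
			uint8_t add_addr   : 1;
			uint8_t rm_addr    : 1;
			uint8_t dss        : 1;
		};
		uint8_t sub_opts;	/* aliases all five flags above */
	};
};

/* Reset every per-packet sub-option flag with a single store. */
static void mptcp_rx_flags_clear(struct mptcp_rx_flags *f)
{
	f->sub_opts = 0;
}
```

A single byte store replaces five separate bit-field writes, which is what would make it cheap enough to call before every option parse.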
[kselftest UNSTABLE] mptcp/export 531346f5a4 ("tcp: mptcp: use mptcp receive buffer space to select rcv window")
by kernel test robot
LINUX COMMIT
============
commit : 531346f5a4531ca65fade2ec1d4c5beea97e5864
subject: tcp: mptcp: use mptcp receive buffer space to select rcv window
date : 2020-04-09 09:17:34 +0800
author : Florian Westphal <fw(a)strlen.de>
Test status of 531346f5a4531ca65fade2ec1d4c5beea97e5864 (compared to v5.6)
==========================================================================
+-------------------------------------------------------------------------------+------+------+----------+------+---+------------+-----+
| | pass | fail | unstable | skip | - | regression | fix |
| kselftests-bpf/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests | 42 | 7 | | 3 | | | |
| kselftests-mptcp/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests | 3 | | | | | | |
+-------------------------------------------------------------------------------+------+------+----------+------+---+------------+-----+
Fails
=====
kselftests-bpf/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests
---
- kernel-selftests.bpf.test_progs.fail
- kernel-selftests.bpf.test_align.fail
- kernel-selftests.bpf.test_btf.fail
- kernel-selftests.bpf.test_sysctl.fail
- kernel-selftests.bpf.test_progs-no_alu32.fail
- kernel-selftests.bpf.test_bpftool.sh.fail
- kernel-selftests.bpf.test_tc_edt.sh.fail
[kselftest UNSTABLE] mptcp/export 474ece0829 ("tcp: mptcp: use mptcp receive buffer space to select rcv window")
by kernel test robot
LINUX COMMIT
============
commit : 474ece0829efb6a3ff213b7258706cc8f0cda3e1
subject: tcp: mptcp: use mptcp receive buffer space to select rcv window
date : 2020-04-20 09:06:52 +0800
author : Florian Westphal <fw(a)strlen.de>
Test status of 474ece0829efb6a3ff213b7258706cc8f0cda3e1 (compared to v5.6)
==========================================================================
+-------------------------------------------------------------------------------+------+------+----------+------+---+------------+-----+
| | pass | fail | unstable | skip | - | regression | fix |
| kselftests-bpf/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests | 42 | 6 | | 3 | | | |
| kselftests-mptcp/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests | 3 | | | | | | |
+-------------------------------------------------------------------------------+------+------+----------+------+---+------------+-----+
Fails
=====
kselftests-bpf/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests
---
- kernel-selftests.bpf.test_progs.fail
- kernel-selftests.bpf.test_align.fail
- kernel-selftests.bpf.test_tcpbpf_user.fail
- kernel-selftests.bpf.test_btf.fail
- kernel-selftests.bpf.test_sysctl.fail
- kernel-selftests.bpf.test_progs-no_alu32.fail
[PATCH v2 0/4] iproute: mptcp support
by Paolo Abeni
This addresses the feedback from Davide, includes the 'ss' support (also
from Davide), and adds a very tentative man page.
Any feedback welcome!
Davide Caratti (1):
ss: allow dumping MPTCP subflow information
Paolo Abeni (3):
uapi: update linux/mptcp.h
add support for mptcp netlink interface.
man: mptcp man page
include/uapi/linux/mptcp.h | 89 ++++++++
ip/Makefile | 2 +-
ip/ip.c | 3 +-
ip/ip_common.h | 1 +
ip/ipmptcp.c | 449 +++++++++++++++++++++++++++++++++++++
man/man8/ip-mptcp.8 | 142 ++++++++++++
misc/ss.c | 62 +++++
7 files changed, 746 insertions(+), 2 deletions(-)
create mode 100644 include/uapi/linux/mptcp.h
create mode 100644 ip/ipmptcp.c
create mode 100644 man/man8/ip-mptcp.8
--
2.21.1
[kselftest UNSTABLE] mptcp/export 100f845620 ("tcp: mptcp: use mptcp receive buffer space to select rcv window")
by kernel test robot
LINUX COMMIT
============
commit : 100f845620e7d7d4f9e457945cb614a5991f0f2b
subject: tcp: mptcp: use mptcp receive buffer space to select rcv window
date : 2020-04-17 09:16:42 +0800
author : Florian Westphal <fw(a)strlen.de>
Test status of 100f845620e7d7d4f9e457945cb614a5991f0f2b (compared to v5.6)
==========================================================================
+-------------------------------------------------------------------------------+------+------+----------+------+---+------------+-----+
| | pass | fail | unstable | skip | - | regression | fix |
| kselftests-bpf/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests | 43 | 5 | | 3 | | | |
| kselftests-mptcp/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests | 3 | | | | | | |
+-------------------------------------------------------------------------------+------+------+----------+------+---+------------+-----+
Fails
=====
kselftests-bpf/vm-snb/debian-x86_64-20191114.cgz/x86_64-rhel-7.6-kselftests
---
- kernel-selftests.bpf.test_progs.fail
- kernel-selftests.bpf.test_align.fail
- kernel-selftests.bpf.test_btf.fail
- kernel-selftests.bpf.test_sysctl.fail
- kernel-selftests.bpf.test_progs-no_alu32.fail
[Weekly meetings] MoM - 16th of April 2020
by Matthieu Baerts
Hello,
We just had our 95th meeting with Mat and Ossama (Intel OTC), Christoph
(Apple), Paolo, Davide and Florian (RedHat) and myself (Tessares).
Thanks again for this new good meeting!
Here are the minutes of the meeting:
Accepted patches:
- The list of accepted patches can be seen on PatchWork:
https://patchwork.ozlabs.org/project/mptcp/list/?state=3
netdev (if mptcp ML is in cc) (by: Florian Westphal):
1269367: Deferred: [net] mptcp: fix double-unlock in mptcp_poll
our repo (by: /):
/
Pending patches:
- The list of pending patches can be seen on PatchWork:
https://patchwork.ozlabs.org/project/mptcp/list/?state=*
netdev (if mptcp ML is in cc) (by: Hillf Danton):
1269397: Rejected: Re: WARNING: bad unlock balance in mptcp_shutdown
our repo (by: Davide Caratti, Florian Westphal, Matthieu Baerts,
Paolo Abeni):
1265469: Needs Review / ACK: selftests:mptcp:pm: rm the right tmp file:
- To be reviewed (one line)
1266897: Under Review: mptcp:pm netlink: fix variable scope:
- Drop? Yes!
1271115: Changes Requested: [v2,1/4] uapi: update linux/mptcp.h
1271117: Changes Requested: [v2,2/4] add support for mptcp netlink
interface.
1271119: Changes Requested: [v2,3/4] ss: allow dumping MPTCP subflow
information
1271118: Changes Requested: [v2,4/4] man: mptcp man page:
- v3 in preparation
1271341: New: [1/7] mptcp: fix splat when incoming connection is never
accepted before exit/close
1271343: New: [2/7] mptcp: fix 'Attempt to release TCP socket in state'
warnings
1271344: New: [3/7] mptcp: handle mptcp listener destruction via rcu
1271345: New: [4/7] mptcp: avoid callback invocation when mptcp parent
socket doesn't exist
1271346: New: [5/7] mptcp: use rcu helpers to fetch ulp subflow context
1271347: New: [6/7] mptcp: reverse order of sk_state_change and is_mptcp
check
1271348: New: [7/7] mptcp: prevent null deref crash on normal tcp sockets:
- The 4th and 7th should go together
- Race condition regarding the 3rd ACK
- Still investigating because the cause is still unclear, maybe
there is another issue we would hide there
- the first 3 patches can be applied, ACK from Paolo.
- Matth can apply them at the end of the export branch
- Florian can also send them directly to netdev (-net)
Bugs on Github:
2: reduce hooking in TCP code → could go upstream before part 4
4: keep a single work struct in mptcp socket → could go upstream
before part 4
5: Allow ss/netstat etc. to show program name for client and
listener MPTCP subflows → could go upstream before part 4
7: cleanup sendmsg_frag allocation
8: fix possible race in subflow_finish_connect()
9: fix "IPv4: Attempt to release TCP socket in state 1 "... on shutdown
10: reduce mptcp options space usage
11: fix fallback to TCP... @dcaratti
3: fix 'mmap' related race
6: loss and delay without reordering causes very slow transfer
13: [syzkaller] WARNING in mptcp_incoming_options
12: [syzkaller] INFO: task hung in lock_sock_nested
FYI: Current Roadmap:
- Part 4 (next merge window):
- Fix bugs reported on Github:
https://github.com/multipath-tcp/mptcp_net-next/issues/
- Shared recv window (full support)
- IPv6 - IPv4 mapped support
- not dropping MPTCP options (ADD_ADDR, etc.)
- FAST_CLOSE
- full MPTCP v1 support (reliable add_addr, etc.)
- after a few failed MPTCP attempts, we fall back to TCP
(like TFO is doing)
- PM server (more advanced)
- Full DATA_FIN support [WIP by Mat]:
- could be nice to have it: if ready
- Active backup support
- ADD_ADDR for MPTCPv1: echo bit [WIP by Peter]
- Opti in TCP option structures (unions) [to be rebased]
- Part 5 (extra needed for prod):
- opti/perfs
- TFO
- PM netlink
- PM bpf
- Scheduler bpf
- syncookies
- [gs]etsockopt per subflow
- notify the userspace when a subflow is added/removed → cmsg
Extra tests:
- news about Syzkaller? (Christoph):
- syzbot reported new issues on netdev, all already addressed
- "WARNING in mptcp_incoming_options": but no reproducer
- news about Intel's kbuild? (Mat):
- there was a report this week but unclear if the errors were
due to our modifications
- https://lists.01.org/hyperkitty/list/mptcp@lists.01.org/thread/TLP3WL3AFX...
- Mat will follow-up
- packetdrill (Davide):
- could be nice to use this new version with the out-of-tree
kernel!
- CI (Matth):
- /
Next meeting:
- We propose to have the next meeting on Thursday, the 23rd of April.
- Usual time: 16:00 UTC (9am PDT, 6pm CEST)
- Still open to everyone!
- https://annuel2.framapad.org/p/mptcp_upstreaming_20200423
Feel free to comment on these points and propose new ones for the next
meeting!
Talk to you next week,
Matt
--
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium