-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCT/TCP/GTEST: Protect against connection from non-UCX sock-based app #4612
Conversation
@akesandgren could test this solution pls? |
71eb09c
to
c3c5f1b
Compare
Mellanox CI: FAILED on 1 of 25 workers (click for details)Note: the logs will be deleted after 31-Dec-2019
|
Tests running |
@akesandgren appreciate your help with the testing of my patches! thank you! |
Mellanox CI: FAILED on 5 of 25 workers (click for details)Note: the logs will be deleted after 31-Dec-2019
|
Kicked off two new test runs with that last commit. Two also running with the code prior to it. |
ok, thank you for keeping me updated! |
Mellanox CI: UNKNOWN on 25 workers (click for details)Note: the logs will be deleted after 01-Jan-2020
|
Mellanox CI: UNKNOWN on 25 workers (click for details)Note: the logs will be deleted after 01-Jan-2020
|
Mellanox CI: FAILED on 3 of 25 workers (click for details)Note: the logs will be deleted after 01-Jan-2020
|
lab configuration issue and #4616 |
Mellanox CI: UNKNOWN on 25 workers (click for details)Note: the logs will be deleted after 02-Jan-2020
|
@brminich could you review when you get chance pls? |
Mellanox CI: UNKNOWN on 25 workers (click for details)Note: the logs will be deleted after 02-Jan-2020
|
04b6ea7
to
7ce549f
Compare
Mellanox CI: UNKNOWN on 25 workers (click for details)Note: the logs will be deleted after 02-Jan-2020
|
Mellanox CI: PASSED on 25 workers (click for details)Note: the logs will be deleted after 02-Jan-2020
|
src/uct/tcp/tcp.h
Outdated
@@ -21,6 +21,8 @@ | |||
|
|||
#define UCT_TCP_CONFIG_PREFIX "TCP_" | |||
|
|||
#define UCT_TCP_MAGIC_NUMBER 0xCAFEBABElu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better define a full 64 bit constant, or use it as 32-bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defined it as const
uint64_t
src/uct/tcp/tcp_cm.c
Outdated
ucs_error("tcp_ep %p: connection establishment for " | ||
"socket fd %d was unsuccessful", ep, ep->fd); | ||
goto err; | ||
status = uct_tcp_cm_send_magic_number(ep); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we add just magic number to uct_tcp_cm_conn_req_pkt for simplicity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, we can't, since uct_tcp_cm_conn_req_pkt
is a part of parsing TCP AM packet:
| TCP AM header | payload |
where CONN req is transferred as a payload
but I move the logic responsible for sending magic number before all data to uct_tcp_cm_send_event()
- so, it simplifies the code
test/gtest/uct/tcp/test_tcp.cc
Outdated
} | ||
|
||
void test_listener_flood(entity& test_entity, size_t max_conn, | ||
size_t msg_size = 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need default value for msg_size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
Mellanox CI: FAILED on 5 of 25 workers (click for details)Note: the logs will be deleted after 03-Jan-2020
|
Mellanox CI: FAILED on 3 of 25 workers (click for details)Note: the logs will be deleted after 04-Jan-2020
|
ddb47fc
to
9af4261
Compare
@yosefe fixed two static analyzer issues in the 3rd commit |
Mellanox CI: FAILED on 2 of 25 workers (click for details)Note: the logs will be deleted after 04-Jan-2020
|
Mellanox CI: FAILED on 2 of 25 workers (click for details)Note: the logs will be deleted after 04-Jan-2020
|
75abf4d
to
3bbf7e5
Compare
@akesandgren do you have any news regarding testing with this patch? |
bot:pipe:retest |
No crashes with commit 23f0ece after ~7 days. Haven't tried with anything newer yet. Wan't me to try with the current head? |
@akesandgren good news. thank you! |
Mellanox CI: FAILED on 12 of 25 workers (click for details)Note: the logs will be deleted after 09-Jan-2020
|
Mellanox CI: ABORTED on 25 workers (click for details)Note: the logs will be deleted after 09-Jan-2020
|
Mellanox CI: FAILED on 12 of 25 workers (click for details)Note: the logs will be deleted after 09-Jan-2020
|
Mellanox CI: UNKNOWN on 25 workers (click for details)Note: the logs will be deleted after 09-Jan-2020
|
Mellanox CI: PASSED on 25 workers (click for details)Note: the logs will be deleted after 09-Jan-2020
|
bot:pipe:retest |
I'm curious, what is typical protection in other TCP/IP based application assuming certain wire level protocols ? Shell we make the magic value user configurable, just in case it conflicts with something else ? |
The main reason for the protection is against random port sniffers and the likes. I see no reason to have it configurable. |
@dmitrygx how much effort is to back port this to v1.7.0 ? (this is follow up on email thread) |
@yosefe can you please take a look. 👍 from my side. |
BTW actually @brminich can approve this as well and it should be enough. |
} else if (ep->ctx_caps & UCS_BIT(UCT_TCP_EP_CTX_TYPE_RX)) { | ||
/* If the EP supports RX only, destroy it */ | ||
} else if ((ep->ctx_caps == 0) || | ||
(ep->ctx_caps & UCS_BIT(UCT_TCP_EP_CTX_TYPE_RX))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is situation when ep has no caps possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is the situation when we are trying to receive the "magic number" packet from the peer that connected to us and it fails by some reason
cm_pkt_length = sizeof(*conn_pkt); | ||
cm_pkt_length = sizeof(*conn_pkt); | ||
|
||
if (ep->conn_state == UCT_TCP_EP_CONN_STATE_CONNECTING) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to send UCT_TCP_CM_CONN_REQ
in non UCT_TCP_EP_CONN_STATE_CONNECTING
state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, when UCT/TCP EP was in CONNECTED state after that EP was created from accepting the connection and then the user wants to create new EP to the same peer (we found this EP from khash and send CONN_REQ mesage to the peer to enable RX capability on peer's side)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe always sending magic number would be simpler?
src/uct/tcp/tcp_ep.c
Outdated
@@ -1001,6 +1017,66 @@ unsigned uct_tcp_ep_progress_put_rx(uct_tcp_ep_t *ep) | |||
return 1; | |||
} | |||
|
|||
static unsigned uct_tcp_ep_progress_data_rx(uct_tcp_ep_t *ep) | |||
{ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, fixed
aee0a33
to
233f890
Compare
👍 |
233f890
to
eaec8bd
Compare
Mellanox CI: UNKNOWN on 22 workers (click for details)Note: the logs will be deleted after 15-Jan-2020
|
Mellanox CI: PASSED on 25 workers (click for details)Note: the logs will be deleted after 16-Jan-2020
|
What
Protect UCT/TCP listener against unexpected connections from non-UCX socked-based applications.
Why ?
Fixes #4525 (master branch):
UCT/TCP accepts a connection from some client isn't UCT/TCP and tries to receive data from it, but a client violates UCT/TCP AM protocol.
How ?
CRC16(EP's peer address)
after a connection was established successfully.CRC16(iface's listener address)
and then move to the main thread.