Bot didn't join the conversation/room #4

Closed
Dual-0 opened this issue Nov 28, 2020 · 14 comments
Labels
bug Something isn't working

Comments

Dual-0 commented Nov 28, 2020

Hello,

I installed the bot without Docker (I also tried with Docker). When I invite the bot to a room/conversation, it doesn't join; the invitation stays pending. I didn't find any errors in homeserver.log.

Thanks in advance.

@geluk geluk added the bug Something isn't working label Nov 28, 2020
Owner

geluk commented Nov 28, 2020

Can you show the logs generated by the bot on startup? It would also be helpful if you could raise the log verbosity first. You can do this by starting the application like so: npm run start -- -vv. Note that the logs will contain private information such as your homeserver URL and appservice token, so please make sure to remove that before sharing.

Author

Dual-0 commented Nov 28, 2020

I installed the webhook from scratch via

  • git clone
  • sudo npm ci
  • sudo npm run start

debug log...

2020-11-28 13:51:06.778.000      DEBUG  [webhook-srv]           Loading configuration file: './gateway-config.yaml'
2020-11-28 13:51:06.817.000      DEBUG  [webhook-srv]           Generating appservice.yaml
2020-11-28 13:51:06.821.000      DEBUG  [webhook-srv]           Creating DB connection with sqlite3
2020-11-28 13:51:06.891.000      SILLY  [webhook-srv]           Retrieving migration status
2020-11-28 13:51:06.929.000      DEBUG  [webhook-srv]           There are no pending migrations
2020-11-28 13:51:06.934.000      SILLY  [webhook-srv]           Bridge configuration
AppServiceRegistration {
  url: 'http://127.0.0.1:8023',
  id: 'webhook-gateway',
  hsToken: 'hasc3somemore',
  asToken: 'nks4jcsomemore',
  senderLocalpart: 'testwebhook',
  rateLimited: true,
  namespaces: {
    users: [
      {
        exclusive: true,
        regex: 'testwebhook'
      }
    ],
    aliases: [],
    rooms: []
  },
  protocols: null,
  cachedRegex: {}
}
2020-11-28 13:51:06.939.000      SILLY  [webhook-srv]           Starting Matrix bridge
2020-11-28 13:51:06.950.000      INFO   [webhook-srv]           Matrix bridge running on 0.0.0.0:8023
2020-11-28 13:51:06.967.000      SILLY  [bridge]                [-] POST http://127.0.0.1:8008/_matrix/client/r0/register (AS) Body: "{\"username\":\"testwebhook\"}"
2020-11-28 13:51:07.41.000       SILLY  [bridge]                [-] POST http://127.0.0.1:8008/_matrix/client/r0/register (AS) HTTP 200 "{\"user_id\":\"@testwebhook:mydomain.tld\",\"home_server\":\"mydomain.tld\",\"access_token\
2020-11-28 13:51:07.43.000       SILLY  [bridge]                [-] PUT http://127.0.0.1:8008/_matrix/client/r0/profile/%40testwebhook%3Amydomain.tld/displayname (AS) Body: "{\"displayname\":\"Webhook\"}"
2020-11-28 13:51:07.69.000       SILLY  [bridge]                [-] PUT http://127.0.0.1:8008/_matrix/client/r0/profile/%40testwebhook%3Amydomain.tld/displayname (AS) HTTP 200 "{}"
2020-11-28 13:51:07.71.000       SILLY  [webhook-srv]           Starting webhook listener
2020-11-28 13:51:07.78.000       DEBUG  [webhook-srv]           Loading plugin from /opt/matrix-webhook-gateway/plugins/__cache/fb4bb74a5bec22dfe1beab71e0e85c195ca111ad432e85b15f0b129393d16eab.js
2020-11-28 13:51:07.83.000       DEBUG  [webhook-srv]           Loading plugin from /opt/matrix-webhook-gateway/plugins/__cache/33f495809f8236b21237c7e3556aba0851f07b9ac517b80ec57d98eab31f36a6.js
2020-11-28 13:51:07.86.000       INFO   [webhook-srv]           Loaded plugins: prometheus, sample
2020-11-28 13:51:07.89.000       INFO   [webhook-srv]           Web server running on 0.0.0.0:8020

After that I started as described:
[image]

According to homeserver.log, the appservice was loaded correctly:
2020-11-28 15:05:36,817 - synapse.config.appservice - 87 - INFO - None - Loaded application service: ApplicationService: {'token': '<redacted>', 'url': 'http://127.0.0.1:8023', 'hs_token': '<redacted>', 'sender': '@testwebhook:mydomain.tld', 'server_name': 'mydomain.tld', 'namespaces': {'users': [{'exclusive': True, 'regex': re.compile('@_hook_.*_.*')}], 'aliases': [], 'rooms': []}, 'id': 'webhook-gateway', 'ip_range_whitelist': None, 'supports_ephemeral': False, 'protocols': set(), 'rate_limited': True}

Owner

geluk commented Nov 28, 2020

It looks like you named the bot user @testwebhook:mydomain.tld, is that right? The part before the colon can be anything you like, but the part after it should be your homeserver name, which in most cases will match the homeserver part of your own Matrix user ID (so if you're @someone:myhomeserver.com, the bot should be @testwebhook:myhomeserver.com).

Author

Dual-0 commented Nov 28, 2020

Yes, that is right. I changed it so that my domain isn't posted here in public.
mydomain.tld stands for the domain configured in homeserver.yaml, and of course it is the same as in my Matrix user ID.

Owner

geluk commented Nov 28, 2020

Okay, understood. Looking through the logs a bit more closely it actually looks like the user namespace regexes do not match up between your appservice registration (appservice.yaml) and your appservice configuration (gateway-config.yaml). If you update any of the settings under app_service: in gateway-config.yaml, a new appservice.yaml will be generated, and you should copy this to your homeserver's appservice configuration again, overwriting the old one.

Something else that's probably causing issues here is that you appear to have user_pattern and sender_localpart set to the same value.
To clarify, in your gateway-config.yaml, the user_pattern key governs what the user ID of a webhook user will look like, while sender_localpart determines the user ID of the management bot (the one you're trying to invite).

Setting both to the same value probably won't work. As far as I know, the management user of an application service must always be separate from the users it tries to invite.

Try setting user_pattern to something different. The default value, user_pattern: '@_hook_{name}_{room}' should work. And of course, copy your appservice.yaml to your Matrix server afterwards.
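
The namespace mismatch described above can also be checked mechanically. The following is a minimal Python sketch, not the gateway's own code: it assumes that each {placeholder} in user_pattern becomes .* in the generated regex, which is consistent with the regexes shown in this thread but has not been checked against the gateway source. The helper names pattern_to_regex and namespaces_match are hypothetical.

```python
import re


def pattern_to_regex(user_pattern: str) -> str:
    """Replace every {placeholder} with '.*' (assumed conversion rule)."""
    return re.sub(r"\{[^}]*\}", ".*", user_pattern)


def namespaces_match(user_pattern: str, registered_regex: str) -> bool:
    """True if the regex loaded by the homeserver equals the one derived from the config."""
    return pattern_to_regex(user_pattern) == registered_regex


# Values taken from the logs in this thread.
print(pattern_to_regex("@_hook_{name}_{room}"))                  # @_hook_.*_.*
print(namespaces_match("@_hook_{name}_{room}", "@_hook_.*_.*"))  # True
print(namespaces_match("@_hook_{name}_{room}", "testwebhook"))   # False
```

A False result suggests that the appservice.yaml loaded by the homeserver is stale and should be copied over again.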

Author

Dual-0 commented Nov 28, 2020

> Okay, understood. Looking through the logs a bit more closely it actually looks like the user namespace regexes do not match up between your appservice registration (appservice.yaml) and your appservice configuration (gateway-config.yaml). If you update any of the settings under app_service: in gateway-config.yaml, a new appservice.yaml will be generated, and you should copy this to your homeserver's appservice configuration again, overwriting the old one.

I've linked the file directly, via app_service_config_files:

app_service_config_files:
  - "/opt/matrix-webhook-gateway/appservice.yaml"

> Try setting user_pattern to something different. The default value, user_pattern: '@_hook_{name}_{room}' should work. And of course, copy your appservice.yaml to your Matrix server afterwards.

I think I have already done it.

Here is my /opt/matrix-webhook-gateway/gateway-config.yaml:

app_service:
  # The ID of this application service. Must be unique on your homeserver.
  id: webhook-gateway
  # Tokens used to facilitate communication between the appservice and the
  # homeserver. Keep these secret! Randomly generated on first startup.
  hs_token: hasc3somemore
  as_token: nks4jcsomemore
  # All webhook user IDs will be generated from this pattern.
  user_pattern: '@_hook_{name}_{room}'
  # The user ID of the webhook configuration bot.
  sender_localpart: testwebhook
  # Should server rate limits be enforced on the application service?
  rate_limited: true
  # The name of your homeserver. This should match the value of the server_name
  # configuration key in Synapse's homeserver.yaml file.
  homeserver_name: mydomain.tld
  # URL used by the homeserver to communicate with the application service.
  app_service_url: http://127.0.0.1:8023
  # URL used by the application service to communicate with the homeserver.
  homeserver_url: http://127.0.0.1:8008
  # Address and port used to listen for incoming events from the homeserver.
  listen_host: '0.0.0.0'
  listen_port: 8023
  # Display name of the bot user.
  bot_user_name: 'Webhook'
  ...

Owner

geluk commented Nov 28, 2020

I'm not sure what's wrong in that case. My impression is that, for whatever reason, Synapse never attempts to send the invitation event to the appservice, but I have no idea why. Something might be misconfigured in appservice.yaml, but I can't tell what.

I'll post my own configuration below, which I know to be working. Perhaps it will be of use to you in spotting a difference that we've overlooked so far.

gateway-config.yaml

app_service:
  id: webhook-gateway
  hs_token: k5x#MASKED#
  as_token: hmh#MASKED#
  user_pattern: '@_hook_{name}'
  sender_localpart: webhook
  rate_limited: true
  homeserver_name: mydomain.tld
  # Both URLs are reverse-proxied, so the ports are missing here
  app_service_url: https://gateway.mydomain.tld
  homeserver_url: https://matrix.mydomain.tld
  listen_host: '0.0.0.0'
  listen_port: 8023

This generates the following appservice.yaml

id: 'webhook-gateway'
hs_token: 'k5x#MASKED#'
as_token: 'hmh#MASKED#'
namespaces:
  users:
    - exclusive: true
      regex: '@_hook_.*'
    - exclusive: true
      regex: '@webhook.mydomain.tld'
  aliases: []
  rooms: []
url: 'https://gateway.mydomain.tld'
sender_localpart: 'webhook'
rate_limited: true
protocols: null

After you send an invitation, you should see the following lines in your homeserver log (I've stripped them down a bit for legibility).

# Your Matrix client sends the invite
"POST /_matrix/client/r0/rooms/!ROOM_ID:mydomain.tld/invite HTTP/1.0"
# Matrix sends the event to the appservice (seems like this never ends up happening?)
Received response to PUT https://gateway.mydomain.tld/transactions/397?access_token=<redacted>: 200
# The gateway bot joins the room
"POST /_matrix/client/r0/join/!ROOM_ID:mydomain.tld?access_token=<redacted> HTTP/1.0"
# Matrix sends the join event to the gateway
Received response to PUT https://gateway.mydomain.tld/transactions/398?access_token=<redacted>: 200
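
To look for these lines without reading the log by eye, here is a rough Python sketch. The patterns are based only on the stripped-down excerpts above; real Synapse log lines carry more context and may be formatted differently between versions, so treat the regular expressions as assumptions. The function name summarize is hypothetical.

```python
import re

# Patterns modelled on the shortened log excerpts above (assumptions).
INVITE = re.compile(r"POST /_matrix/client/r0/rooms/[^/]+/invite")
TRANSACTION = re.compile(r"PUT \S+/transactions/\d+")
JOIN = re.compile(r"POST /_matrix/client/r0/join/")


def summarize(log_lines):
    """Count invite, appservice transaction, and join lines in a homeserver log."""
    counts = {"invite": 0, "transaction": 0, "join": 0}
    for line in log_lines:
        if INVITE.search(line):
            counts["invite"] += 1
        elif TRANSACTION.search(line):
            counts["transaction"] += 1
        elif JOIN.search(line):
            counts["join"] += 1
    return counts
```

Called as summarize(open("homeserver.log")), a result with invites but zero transactions would match the symptom in this issue: the homeserver receives the invite but never forwards it to the appservice.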

Owner

geluk commented Nov 28, 2020

I just remembered something else, do you by any chance have registration disabled? (enable_registration: False in your homeserver configuration)

The bot user (unlike the webhook users it creates) needs to actually be registered on the server, so you should temporarily enable registration when the appservice first starts. After that, you can disable it again.

Author

Dual-0 commented Dec 3, 2020

> app_service:
>   # Both URLs are reverse-proxied, so the ports are missing here
>   app_service_url: https://gateway.mydomain.tld
>   homeserver_url: https://matrix.mydomain.tld

This is crazy, your comment did the trick. My Matrix instance is reverse-proxied too, but I don't have a "subdomain" for the appservice gateway. So I started searching through my homeserver.yaml. There was no appservice port configured by default. The documentation never mentions that you have to configure the port before you can use appservices. So I found out that matrix-webhook-gateway opens port 8023 (found via netstat -tulpen | grep 8023). Is this right? Can I check whether the appservice started correctly in Matrix?

My listeners section in homeserver.yaml

listeners:
  # TLS-enabled listener: for when matrix traffic is sent directly to synapse.
  #
  # Disabled by default. To enable it, uncomment the following. (Note that you
  # will also need to give Synapse a TLS key and certificate: see the TLS section
  # below.)
  #
  - port: 8448
    bind_addresses: ['::', '0.0.0.0']
    type: http
    tls: true
    x_forwarded: false
    resources:
      - names: [federation]
        compress: false

  # enable metrics
  - port: 9092
    type: metrics
    bind_addresses: ['10.0.0.8']

  # Unsecure HTTP listener: for when matrix traffic passes through a reverse proxy
  # that unwraps TLS.
  #
  # If you plan to use a reverse proxy, please see
  # https://github.com/matrix-org/synapse/blob/master/docs/reverse_proxy.md.
  #
  - port: 8008
    tls: false
    type: http
    x_forwarded: true
    bind_addresses: ['127.0.0.1', '10.0.0.8']
    resources:
      - names: [client]
        compress: true

> The bot user (unlike the webhook users it creates) needs to actually be registered on the server, so you should temporarily enable registration when the appservice first starts. After that, you can disable it again.

The webhook user (in my case "testwebhook") was created with enable_registration: False. I checked this in my PostgreSQL DB.

Owner

geluk commented Dec 4, 2020

> So I found out that matrix-webhook-gateway opens port 8023 (found via netstat -tulpen | grep 8023). Is this right? Can I check whether the appservice started correctly in Matrix?

Yes, that is correct. However, you don't need to configure anything in your homeserver.yaml. To clarify how this works: since Synapse and the webhook gateway both need to send events to each other at arbitrary times, each must be able to reach the other. Synapse listens on port 8008 (the client-server API, which is used by Matrix clients, and also by the gateway when it needs to send events to Synapse), and the gateway listens on port 8023. When Synapse needs to send an event to the gateway, it connects to the host and port specified in the url key in appservice.yaml, which in this case means port 8023.

In addition to this, the gateway also listens for incoming webhooks on another port, 8020. This isn't relevant to us right now, but for the sake of completeness I'll mention this as well.

So, in my case, I have Matrix running on its own server, listening on port 8008, with a reverse proxy pointing https://matrix.mydomain.tld to http://127.0.0.1:8008.
On another server, I am hosting the gateway, with a reverse proxy pointing https://gateway.mydomain.tld to http://127.0.0.1:8023.

Your logs show that Synapse is configured to connect to the gateway on http://127.0.0.1:8023, so provided Synapse is able to reach the gateway at that URL, it should work. You can test this manually to confirm if that is indeed the case. With the webhook gateway started at the highest verbosity level (-vvv), all requests to the gateway will be logged, so you can try making a request to check if the gateway can be reached. In your case that would be:

curl -v http://127.0.0.1:8023

If the gateway is reachable, it will return a 404 (since it only allows submission of Matrix events on this endpoint), which should look like this:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>Cannot GET /</pre>
</body>
</html>

It should also generate the following message in the gateway log:

SILLY [bridge] 10.190.31.10 - - [04/Dec/2020:00:40:59 +0000] "GET / HTTP/1.0" 404 139 "-" "curl/7.64.0"
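
The same reachability check can be scripted. This is a small Python sketch using only the standard library; the function name gateway_status is hypothetical, and the URL is the one from your logs. It treats any HTTP answer, including the expected 404, as proof that something is listening.

```python
import urllib.error
import urllib.request


def gateway_status(url: str, timeout: float = 5.0):
    """Return the HTTP status code the gateway answers with, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; a 404 still proves that something is listening.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return None


if __name__ == "__main__":
    status = gateway_status("http://127.0.0.1:8023")
    print("unreachable" if status is None else f"reachable, HTTP {status}")
```

Run it on the machine that hosts Synapse, since that is the side that needs to reach the gateway.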

I hope this clarifies it a bit!

> The webhook user (in my case "testwebhook") was created with enable_registration: False. I checked this in my PostgreSQL DB.

That is good to know, thanks! I remember this being a problem on earlier releases, seems like it has been fixed since then.

Author

Dual-0 commented Dec 5, 2020

I started testing this morning:

root@COM:/opt/matrix-webhook-gateway# sudo npm run start

> webhook-gateway@0.0.0 start /opt/matrix-webhook-gateway
> ts-node --compiler ttypescript entry.ts -v

2020-12-05 10:55:24.688.000      DEBUG  [webhook-srv]           Loading configuration file: './gateway-config.yaml'
2020-12-05 10:55:24.722.000      DEBUG  [webhook-srv]           Generating appservice.yaml
2020-12-05 10:55:24.726.000      DEBUG  [webhook-srv]           Creating DB connection with sqlite3
2020-12-05 10:55:24.846.000      DEBUG  [webhook-srv]           There are no pending migrations
2020-12-05 10:55:24.865.000      INFO   [webhook-srv]           Matrix bridge running on 0.0.0.0:8023
2020-12-05 10:55:24.911.000      WARN   [bridge]                [-] POST http://127.0.0.1:8008/_matrix/client/r0/register (AS) HTTP 400 Error: "{\"errcode\":\"M_USER_IN_USE\",\"error\":\"User ID already taken.\"}"
2020-12-05 10:55:24.949.000      DEBUG  [webhook-srv]           Loading plugin from /opt/matrix-webhook-gateway/plugins/__cache/fb4bb74a5bec22dfe1beab71e0e85c195ca111ad432e85b15f0b129393d16eab.js
2020-12-05 10:55:24.955.000      DEBUG  [webhook-srv]           Loading plugin from /opt/matrix-webhook-gateway/plugins/__cache/33f495809f8236b21237c7e3556aba0851f07b9ac517b80ec57d98eab31f36a6.js
2020-12-05 10:55:24.959.000      INFO   [webhook-srv]           Loaded plugins: prometheus, sample
2020-12-05 10:55:24.962.000      INFO   [webhook-srv]           Web server running on 0.0.0.0:8020

It looks like the user is being created again... I don't know if that's right:
"User ID already taken."

root@COM:~# curl -v http://127.0.0.1:8023
*   Trying 127.0.0.1:8023...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8023 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8023
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< X-Powered-By: Express
< Content-Security-Policy: default-src 'none'
< X-Content-Type-Options: nosniff
< Content-Type: text/html; charset=utf-8
< Content-Length: 139
< Date: Sat, 05 Dec 2020 10:59:27 GMT
< Connection: keep-alive
< Keep-Alive: timeout=5
<
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>Cannot GET /</pre>
</body>
</html>
* Connection #0 to host 127.0.0.1 left intact

I think the curl worked.

Owner

geluk commented Dec 5, 2020

> It looks like the user is being created again... I don't know if that's right:
> "User ID already taken."

That's correct. It's a bit of a quirk in the appservice library, which will always try to register the appservice user even if it already exists, so you can safely ignore this warning.
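
For anyone filtering these logs automatically, a minimal Python sketch of how this warning could be told apart from a real registration failure. The function name is_benign_register_error is hypothetical; the status code and errcode are the ones from the log line above.

```python
import json


def is_benign_register_error(status: int, body: str) -> bool:
    """True if a failed /register call only means the bot user already exists."""
    try:
        errcode = json.loads(body).get("errcode")
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or JSON that is not an object: treat as a real error.
        return False
    return status == 400 and errcode == "M_USER_IN_USE"
```

Any other errcode on the register call is worth a closer look.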

Author

Dual-0 commented Dec 6, 2020

After two weeks of testing, searching through GitHub issues, and testing again, it works.

The solution is described here:
matrix-org/synapse#1834 (comment)

It would be great if someone could explain it to me. I don't understand this failure, and I think an explanation would be good for anybody who is searching for this issue.

@Dual-0 Dual-0 closed this as completed Dec 6, 2020
Owner

geluk commented Dec 6, 2020

Good to hear you were able to figure it out after all, that looks like a pretty nasty issue.
I don't believe there is much I could have done here to make this easier to troubleshoot, but if you have any suggestions, let me know. I'd be happy to implement them.
