Automation
Several tasks could be automated to make the provider list easier to maintain.
Bots
All bots except the scheduler and the GitLab bot have the following task:
- Update providers.json
Scheduler
- Define time periods for running all bots
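A minimal sketch of how the scheduler could decide which bots to run, assuming each bot has a fixed interval. The bot names, the 24-hour intervals, and the function `due_bots` are hypothetical and not defined anywhere in these notes.

```python
from datetime import datetime, timedelta

# Hypothetical bot names and intervals; the notes do not fix either.
BOT_INTERVALS = {
    "xmpp": timedelta(hours=24),
    "compliance": timedelta(hours=24),
    "uptime": timedelta(hours=24),
}


def due_bots(last_runs, now, intervals=BOT_INTERVALS):
    """Return the names of bots whose interval has elapsed since their last run.

    Bots without a recorded run are always due.
    """
    due = []
    for name, interval in intervals.items():
        last_run = last_runs.get(name)
        if last_run is None or now - last_run >= interval:
            due.append(name)
    return due


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0)
    last_runs = {
        "xmpp": datetime(2024, 1, 1, 12, 0),       # 24 hours ago
        "compliance": datetime(2024, 1, 2, 0, 0),  # 12 hours ago
    }
    print(due_bots(last_runs, now))
```

Keeping the schedule in one mapping would let the scheduler stay the only component that knows about time periods, while the bots themselves remain stateless.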
XMPP
- Login and unauthenticated requests
  - Use XEP-0077: In-Band Registration to check IBR support and whether CAPTCHAs (as new basic information) are required
  - Use XEP-0080: User Location via countrycode (a list within countrycode or multiple countrycode elements for multiple locations) to retrieve the property serverLocations
    - not supported by ejabberd and Prosody
  - Use XEP-0092: Software Version to check the software version (or crawl that via the Compliance Tester website)
  - Use XEP-0157: Contact Addresses for XMPP Services to retrieve contact addresses automatically
  - Use XEP-0309: Service Directories to retrieve additional data that cannot automatically be retrieved from other sources (e.g., a registration web page). Update: That XEP is meant for servers doing what we do without a server running the whole time. It is not intended for clients retrieving data but for communication between a server gathering information about providers and the providers' servers. Thus, it is not suitable for this project.
    - https://modules.prosody.im/mod_service_directories.html
  - vCards profile: What and how to use it here (e.g., whether and when accounts are automatically deleted if they are inactive)?
  - Attempt: XEP for JSON / XML file containing all properties that cannot automatically be retrieved from other sources
- Use XEP-0363: HTTP File Upload in combination with XEP-0128: Service Discovery Extensions for maximumHttpFileUploadFileSize
- Join data in a dictionary / dataclass
- Storage in a JSON file (interoperable format between bots)
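The last three items could be sketched as follows: read the max-file-size field that XEP-0363 publishes in a service discovery extension form, collect it in a dataclass, and serialize it to JSON. The class `ProviderData`, the helper names, and the sample stanza are assumptions made for illustration. The sketch keeps the value in bytes as the server reports it; which unit providers.json should use is left open here.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass, field
from typing import List, Optional

DATA_FORM_NS = "jabber:x:data"
UPLOAD_FORM_TYPE = "urn:xmpp:http:upload:0"

# Shortened, made-up disco#info result of an upload component.
SAMPLE_DISCO_INFO = """
<query xmlns='http://jabber.org/protocol/disco#info'>
  <identity category='store' type='file' name='HTTP File Upload'/>
  <feature var='urn:xmpp:http:upload:0'/>
  <x xmlns='jabber:x:data' type='result'>
    <field var='FORM_TYPE' type='hidden'>
      <value>urn:xmpp:http:upload:0</value>
    </field>
    <field var='max-file-size'>
      <value>5242880</value>
    </field>
  </x>
</query>
"""


@dataclass
class ProviderData:
    """Hypothetical container for the values one bot run collects."""
    jid: str
    maximumHttpFileUploadFileSize: Optional[int] = None  # bytes, as reported
    serverLocations: List[str] = field(default_factory=list)


def parse_max_upload_size(disco_info_xml):
    """Return max-file-size in bytes from a disco#info result, or None."""
    root = ET.fromstring(disco_info_xml)
    for form in root.iter(f"{{{DATA_FORM_NS}}}x"):
        values = {}
        for form_field in form.findall(f"{{{DATA_FORM_NS}}}field"):
            value = form_field.find(f"{{{DATA_FORM_NS}}}value")
            if value is not None:
                values[form_field.get("var")] = (value.text or "").strip()
        if values.get("FORM_TYPE") == UPLOAD_FORM_TYPE and "max-file-size" in values:
            return int(values["max-file-size"])
    return None


def to_json(provider):
    """Serialize the collected data so that other bots can read it."""
    return json.dumps(asdict(provider), indent=2, sort_keys=True)


if __name__ == "__main__":
    size = parse_max_upload_size(SAMPLE_DISCO_INFO)
    print(to_json(ProviderData(jid="example.org", maximumHttpFileUploadFileSize=size)))
```

Fetching the disco#info result itself requires an XMPP library and a connection, which this sketch deliberately leaves out.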
Web (REST)
Compliance Tester
- maybe use a general name such as compliance
- Scheduler must trigger the compliance service to run its test (simulated button click) and trigger the retrieval of the result after a certain time (e.g., 1 hour)
IM Observatory
- successor without rating: https://inspect.xmpp.net
- maybe use a general name such as security
- maybe run testssl.sh on our own, set rating in providers.json and link to successor of IM Observatory https://inspect.xmpp.net
Uptime
- Test suite: https://github.com/horazont/xmpp-blackbox-exporter
- Example: https://observe.jabber.network/test/uptime/zombofant.net (https://observe.jabber.network/test/uptime/)
  - t0: beginning of measurements
  - Entries of uptime_history: hourly uptime average (0: always down; 1: always up) by testing uptime every minute
- Tasks done by us
  - Send message to provider via XMPP and email support bots in order to obtain consent for uptime tracking (opt-in); alternative: providers provide their consent by service discovery
  - Process response
  - Update new property (e.g., "uptime-consent": true) in providers.json
  - Create simple list of providers to check out of providers.json and publish it on data.xmpp.net
  - Send message to service if provider opts out later in order to stop uptime tracking quickly
  - Request uptime values of uptime_history daily
  - Update average uptime property (e.g., "uptime": 0.8) in providers.json by retrieving average uptime of last 24 hours and time frame of the stored values
- Tasks done by uptime service
  - Compare new list of providers to check on data.xmpp.net with old list on observe.jabber.network (manually done)
  - On changes, add or remove providers
  - On receiving message when provider opts out, remove provider directly
  - Provide API for each provider's uptime of last 24 hours
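The daily update of the uptime property could be computed as sketched below. The sketch assumes that entry i of uptime_history covers the hour starting at t0 + i hours; that mapping, and the function names, are assumptions and not confirmed by the uptime service's documentation.

```python
from datetime import datetime, timedelta


def average_uptime(uptime_history, hours=24):
    """Average the most recent `hours` hourly entries (each between 0 and 1).

    Returns None if there are no entries yet.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    recent = uptime_history[-hours:]
    if not recent:
        return None
    return sum(recent) / len(recent)


def covered_time_frame(t0, uptime_history, hours=24):
    """Return (start, end) of the averaged window.

    Assumes entry i covers the hour that starts at t0 + i hours.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    total = len(uptime_history)
    used = min(hours, total)
    start = t0 + timedelta(hours=total - used)
    end = t0 + timedelta(hours=total)
    return start, end


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 0, 0)
    # 6 hours down, 12 hours up, 12 hours at 50 % availability
    history = [0.0] * 6 + [1.0] * 12 + [0.5] * 12
    print(average_uptime(history))
    print(covered_time_frame(t0, history))
```

Returning the time frame alongside the average makes it visible when fewer than 24 hourly entries were available for a provider.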
GitLab (REST) via GitLab's REST API
- File diff
- Updates lastCheck
- Create beautiful merge requests: Documentation
- Trigger IM Observatory tests https://xmpp.net/result.php?domain=example.org&type=server and https://xmpp.net/result.php?domain=example.org&type=client automatically
- As soon as another bot's run is finished, the scheduler calls the GitLab bot. The bot runs git diff and, if there are changes, creates a commit and a merge request via GitLab's REST API
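A rough sketch of the last step, assuming the bot compares the old and new providers.json in memory and then prepares a request to GitLab's "create merge request" endpoint (POST /projects/:id/merge_requests). The host, project ID, token, branch names, and function names are placeholders, and creating the commit itself is not covered. The request is only built here, not sent.

```python
import difflib
import json
import urllib.request


def providers_diff(old_text, new_text):
    """Return unified diff lines; an empty list means nothing changed."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="a/providers.json",
            tofile="b/providers.json",
            lineterm="",
        )
    )


def build_merge_request(base_url, project_id, token, source_branch, title,
                        target_branch="master"):
    """Prepare (but do not send) the merge request call.

    The default target branch is an assumption; adjust it to the repository.
    """
    payload = {
        "source_branch": source_branch,
        "target_branch": target_branch,
        "title": title,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v4/projects/{project_id}/merge_requests",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    old = '{"lastCheck": "2024-01-01"}\n'
    new = '{"lastCheck": "2024-01-02"}\n'
    if providers_diff(old, new):
        request = build_merge_request(
            "https://gitlab.example", 42, "secret-token",
            "bot/update-providers", "Update providers.json",
        )
        print(request.get_method(), request.full_url)
        # urllib.request.urlopen(request) would send it.
```

Separating "is there a diff?" from "build the request" keeps the bot testable without network access or a real token.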
XMPP Support
- Trivial testing of working server contact address
- Address verification response
- Storage in a JSON file (interoperable format between bots)
- OPTIONAL: Send XMPP messages to the support addresses of a provider (or separately provided addresses by the admins) when the provider's category changes
Email Support
- Sending and receiving emails with Python
- Trivial testing of working server contact address
- Address verification response
- Storage in a JSON file (interoperable format between bots)
- Status updates to providers
- OPTIONAL: Send email to the support addresses of a provider (or separately provided addresses by the admins) when the provider's category changes
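For the email side, a minimal sketch of composing a contact address verification message with Python's standard library. The verification token, the wording of the message, and the function name are assumptions; the notes do not specify how a verification response is matched to a request.

```python
from email.message import EmailMessage


def build_verification_email(sender, recipient, provider, token):
    """Compose a message asking the provider to confirm its contact address.

    `token` is a hypothetical value the bot would look for in the reply.
    """
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Contact address verification for {provider}"
    message.set_content(
        f"Hello,\n\n"
        f"this address is listed as a support contact for {provider}.\n"
        f"Please reply to this email and keep the following code in your answer:\n\n"
        f"{token}\n"
    )
    return message


if __name__ == "__main__":
    email = build_verification_email(
        "bot@example.net", "support@example.org", "example.org", "abc123"
    )
    print(email)
    # Sending would use smtplib, e.g.:
    # with smtplib.SMTP("localhost") as smtp:
    #     smtp.send_message(email)
```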
Further ideas
- Generate and provide up-to-date filtered lists on web server (e.g., on subdomain of https://data.xmpp.net): XSF Infrastructure issue
- Run XMPP Compliance Tester and update provider list with results
  - Create accounts on all providers
  - Extend tester by MIX support for clients supporting it
  - Run bot that checks all providers once a day, creates merge requests if something changes and updates lastCheck
- Run IM Observatory and update provider list with results
  - Run bot that checks all providers once a day, creates merge requests if something changes and updates lastCheck
- Provide badges on a server so that providers can embed them on their websites
  - After clicking on the badge, the user is redirected to a web page on the badge server with details on all criteria and the reason why the provider is in a specific category
General considerations
- Software architecture: Do we need our own data processing component?
- Pipeline integration
- Bots should focus on retrieving changes in the data
- Where should the bots run? --> Move towards a separate repository?
- Draw entire architecture (https://app.diagrams.net/)
- Draw Gantt chart (https://app.diagrams.net/)
- Use artifacts from GitHub setup / configuration
- XMPP bots will just download the providers.json for the beginning
  - Do we need to involve other people to change server software and XEP standards regarding, e.g., HTTP Upload and dealing with XEP capabilities?
- Target: Unauthenticated access (to avoid maintaining accounts)
- Run own services?
  - Does KDE allow us?
- Which properties should / can be automatically retrieved from the XMPP server?
- How must / should the properties be provided by the XMPP servers in order to process them?
- Split work into separate milestones
- How is the server support for providing each property at the moment?
- Are accounts on each XMPP server needed to retrieve the data?
  - Can the data be retrieved via an unauthorized XMPP connection or even only via HTTPS?
- How to handle missing data (Which fallback should be used)?
- How should retrieved data update the provider list?
  - When should the data update the provider list?
  - When is a manual review needed before updating the list?
- How to ensure that the retrieved data is also displayed (and equal) at the provider's website?
  - The goal should be to have exactly one source (e.g., the server's configuration file)
  - How can providers embed that data in their website (HTML / JSON provided by XMPP Providers, layout / theme issues, separate page with own theme)?
- How can automatically created results for rankings provide a corresponding result page?
  - That would be easy if the tests could be run regularly on their existing websites and must only be checked on a regular basis via an API or by crawling their websites
  - If we run our own tests, we need accounts on each server (even for those with already closed registration), a server running those tests and a website for providing the results
- What is the data format and data collection we are going to deal with?
  - Format will be critical if we change the JSON format (keys?) in the future.
- What happens if open MRs experience changes while review is going on?
- How do we move from the old way of providing information to the new automated process?
  - Flag manually and automatically generated
  - Do not overwrite when there is no automated info (10 attempts?)
  - Don't proceed in two ways: let's inform, but then the bots will go live from a specific point
  - 1st phase: Only automated information is processed
  - 2nd phase: Manually provided information is processed as well
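One possible reading of the flagging and "do not overwrite" rules, sketched in Python. The entry layout (value, source, failedAttempts, needsReview) is hypothetical, and marking an entry for manual review after 10 failed attempts is only one interpretation of "10 attempts?"; the notes leave open what should happen at that point.

```python
MAX_FAILED_ATTEMPTS = 10  # taken from "10 attempts?"; the exact number is open


def merge_property(entry, automated_value):
    """Merge one automatically retrieved value into a stored property entry.

    `entry` is a dict such as {"value": "de", "source": "manual"}.
    - If the bot retrieved a value, it replaces the stored one and the entry
      is flagged as automatically generated.
    - If the bot retrieved nothing (None), the stored value is kept and only
      the failure counter increases. Once the counter reaches the limit, the
      entry is marked for manual review (an assumption, see above).
    """
    if automated_value is None:
        failed = entry.get("failedAttempts", 0) + 1
        return {
            **entry,
            "failedAttempts": failed,
            "needsReview": failed >= MAX_FAILED_ATTEMPTS,
        }
    return {
        "value": automated_value,
        "source": "automated",
        "failedAttempts": 0,
        "needsReview": False,
    }


if __name__ == "__main__":
    entry = {"value": "de", "source": "manual"}
    entry = merge_property(entry, None)   # keeps the manual value
    print(entry)
    entry = merge_property(entry, "fr")   # automated value takes over
    print(entry)
```

The function returns a new dict instead of modifying the stored one, so a reviewer can compare the old and new entry before anything is written back to providers.json.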