Up to this point, I had a working up-to-date federated lemmy instance.
Everything was working perfectly.
By default, my instance could only list local content.
Remote communities only get listed once a local user has searched for them, or better yet subscribed to them.
The issue with this approach is how can I discover new communities to subscribe to?
On Mastodon one follows users (rather than "subscribing" to them), which lets you see those users' threads involving other users, who can in turn be followed.
On Lemmy, we subscribe to communities, and in this case other communities are almost never brought up naturally.
The only thing I could do was go on lemmy.ml, list all publications, look for interesting communities, search for them on my instance and then subscribe.
But there was a solution: I knew that searching for a federated community from my instance would get it listed on mine.
Why not search for all of the communities of a federated instance so that they all get listed on mine?
So I came up with a little script.
I talked about it with an admin of lemmy.ml beforehand as I did not want to do something wrong and somehow get my instance blacklisted.
They said it was ok, as long as I don’t DDoS their instance.
I think I’ll be okay: I don’t want to do that, nor could I.
Here’s the help message from the script:
    usage: lemmy-federator.py [-h] --source LEMMY_SOURCE_INSTANCE --destination LEMMY_DESTINATION_INSTANCE [--communities-batch-size COMMUNITIES_BATCH_SIZE]

    Search a Lemmy instance's communities from another one so as to get them listed

      -h, --help            show this help message and exit
      --source LEMMY_SOURCE_INSTANCE
                            The instance communities are being pulled from ("lemmy.ml" for example)
      --destination LEMMY_DESTINATION_INSTANCE
                            The instance communities are being retrieved to ("mylemmy.com" for example)
      --communities-batch-size COMMUNITIES_BATCH_SIZE
                            Number of communities to pull on this batch
      --no-dry-run          Set this flag to actually do the stuff
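For readers who want to see how a command line like this could be declared, here is a minimal sketch using argparse from the Python standard library. It is reconstructed from the help message only, not copied from the script: the function name build_parser and the dest names are assumptions, and the default of 15 comes from the description of the batch size in this post.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the CLI from its help text;
    # the real script may declare its arguments differently.
    parser = argparse.ArgumentParser(
        prog="lemmy-federator.py",
        description="Search a Lemmy instance's communities from another one "
                    "so as to get them listed",
    )
    parser.add_argument(
        "--source",
        dest="lemmy_source_instance",
        metavar="LEMMY_SOURCE_INSTANCE",
        required=True,
        help='The instance communities are being pulled from ("lemmy.ml" for example)',
    )
    parser.add_argument(
        "--destination",
        dest="lemmy_destination_instance",
        metavar="LEMMY_DESTINATION_INSTANCE",
        required=True,
        help='The instance communities are being retrieved to ("mylemmy.com" for example)',
    )
    parser.add_argument(
        "--communities-batch-size",
        type=int,
        default=15,  # default batch size mentioned in this post
        help="Number of communities to pull on this batch",
    )
    parser.add_argument(
        "--no-dry-run",
        action="store_true",  # off by default, so a plain run changes nothing
        help="Set this flag to actually do the stuff",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Making the dry run the default means a mistyped command cannot trigger any searches by accident; you have to opt in with --no-dry-run.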
The script has two dependencies, BeautifulSoup and requests, which can easily be installed with a
pip install bs4 requests (although I'd rather use virtualenvs).
It lists all communities from the source instance, searches for each one on the destination instance, and stores it in a cache.txt file in the current directory so that it can be ignored the next time the script is run.
The script will only process a defined number of new communities (COMMUNITIES_BATCH_SIZE) at each run, which defaults to 15.
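The caching and batching described above can be sketched roughly as follows. This is not the script's actual code, only an illustration under assumptions: the function names (load_cache, select_batch, run_batch) are made up, the cache is assumed to hold one community name per line, and the search itself is passed in as a callable so the sketch does not depend on any network access.

```python
from pathlib import Path


def load_cache(path):
    """Return the set of community names already processed (one per line)."""
    p = Path(path)
    if not p.exists():
        return set()
    return {line.strip() for line in p.read_text().splitlines() if line.strip()}


def select_batch(communities, cache, batch_size=15):
    """Keep only communities not in the cache, limited to batch_size."""
    new = [name for name in communities if name not in cache]
    return new[:batch_size]


def run_batch(communities, cache_path, search, batch_size=15, dry_run=True):
    """Search for one batch of new communities and record them in the cache.

    `search` is a hypothetical callable that performs the search on the
    destination instance for a single community name.
    """
    cache = load_cache(cache_path)
    batch = select_batch(communities, cache, batch_size)
    for name in batch:
        if dry_run:
            print(f"[dry run] would search for {name}")
            continue
        search(name)
        # Append right away so an interrupted run does not redo finished work.
        with open(cache_path, "a") as cache_file:
            cache_file.write(name + "\n")
    return batch
```

With 20 communities on the source and a batch size of 15, the first real run would handle 15 of them, the second the remaining 5, and later runs would do nothing until new communities show up.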
The only thing left to do is to schedule it: every fifteen minutes at first, for example, until all communities are retrieved.
After that, once a day or even once every few days should be enough, as I don’t think a lot of new communities are created every day.
Here’s what I put in my crontab on a server for a daily execution at 3:29 PM:
29 15 * * * cd /lemmy-federator/directory/path/;/usr/bin/python3 /lemmy-federator/directory/path/lemmy-federator.py --source lemmy.ml --destination lemmy.coupou.fr --no-dry-run
For now it works as I imagined it would. I still have to check whether the retrieved communities somehow get deleted if they are never used (if nobody subscribes).
If that’s the case I would have to come up with a solution but I’ll worry about that when and if the issue arises.
As a reminder, I’m not a developer. There must be a lot of room for improvement in the script, but it works for me, and that’s enough.
I wanted to leverage the Lemmy API at first but I could never understand how to use it.
Must be my lack of developing skills :p
If this script is useful to you or if you improve it in any way, let me know!