feat: add a script to bulk mark users as spam retroactively #3521

Open: tefkah wants to merge 3 commits into main from tfk/bulk-spam

Conversation

tefkah (Member) commented Mar 4, 2026

  • feat: add bulk spam labeling script
  • refactor: write analyze results incrementally

Issue(s) Resolved

Still so many spam users!

This PR adds a script that scans every. single. user in PubPub and determines whether they are likely spam.

It does this by checking a few metrics, each of which increases the user's spam score:

  • Do all of the user's comments contain links? +2
  • Is the user not associated with any Community (no membership, no attribution) and do they have a link in their profile? +2
  • Do they post comments with links and have no memberships? (This alone qualifies someone as spam.) +4
  • Do their comments contain common spam phrases ("I like your blog!") along with a link? +2
  • Do they have a URL in their bio? +2
  • Did they immediately add their website to their profile? +3
  • Do they have a website in their profile and no memberships? +2
  • Does their profile contain spam phrases ("BUY PILLS!!")? The score depends on how severe the phrase is.

By default the script considers a user spam if they score more than 4.
I think running it with a threshold of 6 should weed out almost everyone, but the threshold can be increased.
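
For reviewers who want to reason about the weights, here is a minimal sketch of how the rules above could be combined into a score and compared against the threshold. It is not the script in this PR: `UserSignals`, `scoreUser`, and every field name are hypothetical, and the strict `>` comparison is an assumption taken from "more than 4". Under that strict reading, the +4 rule alone lands exactly on the default threshold, so the real script may compare differently.

```typescript
// Hypothetical input shape: field names are illustrative, not PubPub's schema.
interface UserSignals {
  commentCount: number;
  commentsWithLinks: number;
  commentsWithSpamPhraseAndLink: number;
  hasMembership: boolean;
  hasAttribution: boolean;
  profileHasLink: boolean;
  bioHasUrl: boolean;
  hasWebsite: boolean;
  addedWebsiteImmediately: boolean;
  // Severity-dependent points for spam phrases in the profile (assumed precomputed).
  profilePhraseScore: number;
}

interface SpamVerdict {
  score: number;
  reasons: string[];
  isSpam: boolean;
}

const DEFAULT_THRESHOLD = 4;

function scoreUser(u: UserSignals, threshold: number = DEFAULT_THRESHOLD): SpamVerdict {
  const unaffiliated = !u.hasMembership && !u.hasAttribution;

  // Each rule: [description, did it match, points], weights from the list above.
  const rules: Array<[string, boolean, number]> = [
    ["all comments contain links", u.commentCount > 0 && u.commentsWithLinks === u.commentCount, 2],
    ["no community association and a link in the profile", unaffiliated && u.profileHasLink, 2],
    ["comments with links and no memberships", u.commentsWithLinks > 0 && !u.hasMembership, 4],
    ["spam phrase plus link in comments", u.commentsWithSpamPhraseAndLink > 0, 2],
    ["URL in bio", u.bioHasUrl, 2],
    ["website added immediately", u.addedWebsiteImmediately, 3],
    ["website and no memberships", u.hasWebsite && !u.hasMembership, 2],
  ];

  const hits = rules.filter(([, matched]) => matched);
  const reasons = hits.map(([name]) => name);
  let score = hits.reduce((sum, [, , points]) => sum + points, 0);

  if (u.profilePhraseScore > 0) {
    score += u.profilePhraseScore;
    reasons.push("spam phrases in profile");
  }

  // Assumption: "more than 4" is a strict comparison.
  return { score, reasons, isSpam: score > threshold };
}
```

With these weights, a user with no memberships whose comments all contain links scores 2 + 4 = 6, which is flagged at the default threshold but not at a threshold of 6 under the strict comparison. Returning the matched reasons alongside the score makes it easier to spot-check false positives before anything is marked as spam.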

Test Plan

Screenshots (if applicable)

Optional

Notes/Context/Gotchas

Supporting Docs

  "@types/mime-types": "^2.1.3",
  "@types/multer": "^1.4.9",
- "@types/node": "^18.11.4",
+ "@types/node": "^24.4.0",
tefkah (Member, Author) commented:

wild that we were still on such an old one
