Lock all usernames from imported XML dump
I’m forking a wiki, therefore I have imported an XML dump in a fresh MediaWiki installation.
This dump also contains revisions and user pages (in the User: namespace), which I want to keep. So there are many links that lead to user pages, both existing and non-existent.
However, as I don’t have access to the user database of the original wiki, I have no way to reserve these usernames for their original owners. To prevent someone from registering in my wiki with a username that was used in the original wiki (this new user would automatically own all revisions made by the old user with the same name), I want to lock all previously used usernames.
How could I do this?
As there are more than a hundred users, I don’t want to do this manually.
You can use the Unix tools grep, sed, sort, and uniq to pull all the user names out of the dump.
Each user page appears in the dump as <title>User:User Name</title>. You can extract the names with these commands:
grep '<title>User:' -- Pull out just the titles of the user pages
sed 's/.*User://g;s|</title>||g;s/\r//g' -- Strip each line down to just the user name (and drop any carriage returns)
sort -- alphabetize them
uniq -- remove duplicates (it is a history file)
perl -p -e "s/\n/','/g" -- replace the newlines with ',' to make it easy to paste into the $wgReservedUsernames array
Putting it all together:
grep '<title>User:' my-wiki-dump-history.xml | sed 's/.*User://g;s|</title>||g;s/\r//g' | sort | uniq | perl -p -e "s/\n/','/g"
I tested this by downloading a dump from archive.org and running the pipeline against the included history XML file. A 500MB file should pose no problems for this method.
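If you would rather have the finished LocalSettings.php line generated for you, the same idea can be sketched in Python. This is only a rough sketch under a few assumptions: the dump uses the English "User:" prefix and puts each <title> on its own line, anything after the first "/" in a title is a subpage rather than part of the name, and you want to append to $wgReservedUsernames with array_merge instead of overwriting it. The function names (extract_usernames, to_php_array, reserve_from_dump) are made up for this example.

```python
import html
import re

# Matches user-page titles such as <title>User:Some Name</title>.
# Assumption: the dump uses the English "User:" namespace prefix.
TITLE_RE = re.compile(r"<title>User:([^<]+)</title>")


def extract_usernames(lines):
    """Return the sorted, de-duplicated user names found in dump lines."""
    names = set()
    for line in lines:
        match = TITLE_RE.search(line)
        if match:
            # Decode XML entities (e.g. &amp;) and trim stray whitespace.
            title = html.unescape(match.group(1)).strip()
            # Assumption: "Name/Sandbox" is a subpage of the user "Name".
            names.add(title.split("/", 1)[0])
    return sorted(names)


def to_php_array(names):
    """Format names as a PHP statement that extends $wgReservedUsernames."""
    entries = []
    for name in names:
        # Escape backslashes and single quotes for a PHP single-quoted string.
        escaped = name.replace("\\", "\\\\").replace("'", "\\'")
        entries.append(f"    '{escaped}',")
    body = "\n".join(entries)
    return (
        "$wgReservedUsernames = array_merge( $wgReservedUsernames, [\n"
        f"{body}\n"
        "] );"
    )


def reserve_from_dump(path):
    """Read a dump file line by line and return the PHP statement."""
    with open(path, encoding="utf-8") as dump:
        return to_php_array(extract_usernames(dump))
```

Calling print(reserve_from_dump("my-wiki-dump-history.xml")) should give you a block you can paste into LocalSettings.php. Like the grep pipeline, this only catches users who have a user page in the dump, so check the result before relying on it.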