Identify the posts
I am using this kind of query on the database:
select
regexp_matches(raw, 'http://(?:www\.|)forumtopics.com/busobj/([\w\d\.?=]*?)(?:\d*?)', 'g'),
count(1)
from posts
group by 1
order by 2 desc;
Here is a table of my findings:
regexp_matches | count |
---|---|
{viewtopic.php?t=} | 16084 |
{viewtopic.php?p=} | 4293 |
{faq.php?mode=rules} | 2193 |
{search.php} | 1123 |
{faq.php?mode=bbcode} | 810 |
{viewforum.php?f=} | 354 |
{images} | 262 |
{faq.php?mode=ask} | 192 |
{shop.php} | 63 |
{profile.php?mode=viewprofile} | 59 |
{files} | 47 |
{printview.php?t=} | 43 |
{templates} | 42 |
{repository} | 37 |
{search.php?mode=results} | 28 |
{index.php} | 12 |
{contact.php} | 12 |
{download.php?id=} | 10 |
{downloads} | 10 |
(Update: changed to a more powerful regex function.)
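To sanity-check these counts outside the database, the same grouping can be sketched in Python. This is only a rough equivalent of the query, not a port of it: it captures everything after `/busobj/` and then strips trailing digits, which is a simplification of the SQL pattern. The function name `count_link_kinds` and the sample posts are made up for illustration.

```python
import re
from collections import Counter

# Capture the path (and query string) that follows /busobj/.
# Assumption: links only use word characters, '.', '?' and '=' in that part.
LINK_RE = re.compile(r"http://(?:www\.)?forumtopics\.com/busobj/([\w.?=]*)")


def count_link_kinds(raw_posts):
    """Count link kinds across post bodies, most frequent first."""
    counts = Counter()
    for raw in raw_posts:
        for match in LINK_RE.finditer(raw):
            # Drop the trailing numeric id so all topic links group together.
            kind = match.group(1).rstrip("0123456789")
            counts[kind] += 1
    return counts.most_common()


# Hypothetical sample data standing in for the "raw" column of "posts".
sample_posts = [
    "See http://www.forumtopics.com/busobj/viewtopic.php?t=12345 for details",
    "Also http://forumtopics.com/busobj/viewtopic.php?t=99 and "
    "http://www.forumtopics.com/busobj/viewtopic.php?p=777#777",
    "Rules: http://www.forumtopics.com/busobj/faq.php?mode=rules",
]

for kind, count in count_link_kinds(sample_posts):
    print(kind, count)
```

A link followed directly by a sentence-ending period would keep that period in the captured part, so real data may need a little extra cleanup.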
Some points:
- If there are fewer than 10 occurrences, I will probably handle them manually. Let’s focus on the top of the list.
- I understand that `viewtopic.php?t=` is linking to a topic id, and I am fine with that. But `viewtopic.php?p=` is linking to a post id??
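To tell the two `viewtopic.php` forms apart when processing a post, something like the following could work. It is a minimal sketch, assuming `t=` carries a topic id and `p=` carries a post id (as in stock phpBB, but I have not verified it for this forum). The function name `extract_viewtopic_ids` is made up for illustration.

```python
import re

# Match viewtopic links and capture the parameter name (t or p) and its numeric id.
VIEWTOPIC_RE = re.compile(
    r"http://(?:www\.)?forumtopics\.com/busobj/viewtopic\.php\?([tp])=(\d+)"
)

# Assumption: t= is a topic id and p= is a post id.
KINDS = {"t": "topic", "p": "post"}


def extract_viewtopic_ids(raw):
    """Return (kind, id) pairs for every viewtopic link found in a post body."""
    return [
        (KINDS[match.group(1)], int(match.group(2)))
        for match in VIEWTOPIC_RE.finditer(raw)
    ]


# Hypothetical post body for illustration.
raw = (
    "Compare http://www.forumtopics.com/busobj/viewtopic.php?t=12345 with "
    "http://forumtopics.com/busobj/viewtopic.php?p=777#777"
)
print(extract_viewtopic_ids(raw))
```

Keeping the kind next to the id matters because the two cases need different lookups later: a topic id maps to a topic directly, while a post id first has to be resolved to the topic that contains it.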