Identify the posts
I am using this kind of query on the database:
select
regexp_matches(raw, 'http://(?:www\.|)forumtopics\.com/busobj/([\w\d\.?=]*?)(?:\d*?)', 'g'),
count(1)
from posts
group by 1
order by 2 desc;
Here is a table of my findings:
| regexp_matches | count |
|---|---|
| {viewtopic.php?t=} | 16084 |
| {viewtopic.php?p=} | 4293 |
| {faq.php?mode=rules} | 2193 |
| {search.php} | 1123 |
| {faq.php?mode=bbcode} | 810 |
| {viewforum.php?f=} | 354 |
| {images} | 262 |
| {faq.php?mode=ask} | 192 |
| {shop.php} | 63 |
| {profile.php?mode=viewprofile} | 59 |
| {files} | 47 |
| {printview.php?t=} | 43 |
| {templates} | 42 |
| {repository} | 37 |
| {search.php?mode=results} | 28 |
| {index.php} | 12 |
| {contact.php} | 12 |
| {download.php?id=} | 10 |
| {downloads} | 10 |
(Update: changed to a more powerful regex function.)
Some points:
- If a pattern has fewer than 10 occurrences, I will probably fix it manually. Let’s focus on the top of the list.
- I understand that `viewtopic.php?t=` links to a topic id, and I am fine with that. But does `viewtopic.php?p=` link to a post id?
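To look at the two `viewtopic.php` variants separately, a variation of the query could capture both the parameter name and the numeric id. This is only a sketch, not a tested migration step: it assumes the same `posts` table and `raw` column as the query above, and the aliases `parts`, `link_type`, `old_id`, and `link_count` are made up for illustration.

```sql
-- Sketch: split viewtopic.php links into topic (t=) and post (p=) references.
-- Assumes the same posts table and raw column as the earlier query.
select
  parts[1] as link_type,   -- 't' for topic links, 'p' for post links
  parts[2] as old_id,      -- numeric id taken from the old forum URL
  count(*) as link_count
from posts,
  -- a set-returning function in FROM can reference posts.raw directly
  regexp_matches(
    raw,
    'http://(?:www\.)?forumtopics\.com/busobj/viewtopic\.php\?([tp])=(\d+)',
    'g'
  ) as m(parts)
group by 1, 2
order by 3 desc
limit 20;
```

Grouping by `link_type` alone instead would give the totals per variant, which should roughly line up with the first two rows of the table if the pattern is right.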