putting stupid search engine urls back
Gabriel Ramuglia
gabe at vtunnel.com
Sun Aug 30 08:10:28 MSD 2009
I would recommend (if you have Windows) "The Regex Coach". You put your
regex in the top pane and the text to match in the bottom pane, and it
shows you what is going on. My first guess is that nginx regex syntax
differs from Perl's, but I could also have made a mistake or two.
The regex I wrote, in human-speak, is:
$url =~ s#/[^/].*?([0-9]).*\.html$#/showthread.php?t=$1#i;
$url =~ s
(substitution regex against $url variable)
# # #i;
using # as the delimiter since we're using slash in our regex, case
insensitive matching
/[^/].*?
start with a slash, then one character that is not a slash, then any
characters after that, but as few of them as possible
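([0-9]).*
ok that was an error of mine: the parentheses capture only a single
digit. It should have read:
([0-9].*)
which is, match a digit followed by as many characters of any kind as
you can find, and store the whole group in $1. Because the part before
it matches as few characters as possible, the group starts at the
earliest digit that still lets the rest match.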
\.html$
then the period character, then "html", at the end of the line ($
means end of line)
#/showthread.php?t=$1#
replace what we matched in the first part with what follows:
/showthread.php?t=
and then whatever we matched earlier in ([0-9].*)
So the new regex after fixing my error would be:
$url =~ s#/[^/].*?([0-9].*)\.html$#/showthread.php?t=$1#i;
In perl anyway.
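
A quick way to check the corrected pattern without The Regex Coach is a
small script. The sketch below is in Python rather than Perl (an assumption
made for illustration; the pattern syntax used here is the same in both).
The helper name to_showthread and the second sample path are hypothetical.
It also tries a stricter variant, -t([0-9]+)\.html$, which assumes the
thread id is always the run of digits after a final "-t"; nothing in this
thread confirms that every url follows that scheme.

```python
import re

# Corrected pattern from the message above: lazy prefix, then a digit
# followed by anything, up to ".html" at the end of the path.
CORRECTED = re.compile(r"/[^/].*?([0-9].*)\.html$", re.IGNORECASE)

# Hypothetical stricter variant: assumes the thread id is always the
# run of digits after a final "-t" (not confirmed by the thread).
STRICT = re.compile(r"/[^/]*-t([0-9]+)\.html$", re.IGNORECASE)


def to_showthread(path, pattern):
    """Rewrite a friendly path to /showthread.php?t=<captured group>."""
    # A callable replacement avoids any escape processing of the
    # replacement text; paths that do not match are returned unchanged.
    return pattern.sub(lambda m: "/showthread.php?t=" + m.group(1), path, count=1)


if __name__ == "__main__":
    for path in ("/anyone-doing-late-ridei-t19640.html",
                 "/top-10-rides-t19640.html"):  # second path is made up
        print(path)
        print("  corrected:", to_showthread(path, CORRECTED))
        print("  strict:   ", to_showthread(path, STRICT))
```

With the sample url from this thread both patterns give
/showthread.php?t=19640. They differ only when the title itself contains a
digit, as in the made-up second path, where the corrected pattern captures
everything from the first digit onward.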
On Sat, Aug 29, 2009 at 7:58 PM, AMP Admin<admin at ampprod.com> wrote:
> Well, I couldn't get that to work. I'm not too good with regex stuff.
>
> Anyone wanna give me an assist on the following?
>
> I need
> http://www.example.com/anyone-doing-late-ridei-t19640.html
>
> to go to
> http://www.example.com/showthread.php?t=19640
>
>
>
> -----Original Message-----
> From: owner-nginx at sysoev.ru [mailto:owner-nginx at sysoev.ru] On Behalf Of AMP
> Admin
> Sent: Saturday, August 29, 2009 9:13 PM
> To: nginx at sysoev.ru
> Subject: RE: putting stupid search engine urls back
>
> There's no winning with you is there Gabriel... jk haha
>
> You have a valid point there. It sucks managing over a half a million links
> on a site.
>
> I'll give your regex a try and see if it works in nginx.
>
> Thanks for all your replies. :)
>
> -----Original Message-----
> From: owner-nginx at sysoev.ru [mailto:owner-nginx at sysoev.ru] On Behalf Of
> Gabriel Ramuglia
> Sent: Saturday, August 29, 2009 8:11 PM
> To: nginx at sysoev.ru
> Subject: Re: putting stupid search engine urls back
>
> Well... first off, any major change to software like that is going to
> change the url patterns and break your links, whether you're using
> friendly looking links or showthread kind of links.
>
> More to your question, it looks like a rewrite or redirect is your
> answer to that question ;)
>
> A regex like
>
> $url =~ s#/[^/].*?([0-9]).*\.html$#/showthread.php?t=$1#i;
>
> would do the trick in perl.
>
>
> On Sat, Aug 29, 2009 at 4:52 PM, AMP Admin<admin at ampprod.com> wrote:
>> Hi again Gabriel,
>>
>> Thank you for your reply.
>>
>> I think SEF URLs are debatable. The more research I do, the more I find
>> them unnecessary. They also make growth difficult. What if you change
>> your CMS, forum or whatever, and the dynamic content is generated
>> differently? All of your "static lookalike" links are no longer valid. I
>> also believe this to be true of major revisions of current CMSs, forums
>> and other software. If they do a complete code rewrite, as many of them
>> claim with major revisions, then that might affect URLs too. Eventually
>> you have more lines of rewrites and garbage in your config than anything
>> else, which is not the intended use. It may not strain the server, but
>> every little bit helps, and if someone thinks they can just let that go
>> then where does it stop?
>>
>> Anyway, here's an old article that's worth a read in regards to the great
>> sef debate. http://www.emediawire.com/releases/2005/4/emw232456.htm
>>
>> Finally, that brings me back to my original question...
>> How can I turn:
>> /anyone-doing-late-ridei-t19640.html
>>
>> into:
>> /showthread.php?t=19640
>>
>>
>> -----Original Message-----
>> From: owner-nginx at sysoev.ru [mailto:owner-nginx at sysoev.ru] On Behalf Of
>> Gabriel Ramuglia
>> Sent: Saturday, August 29, 2009 4:10 PM
>> To: nginx at sysoev.ru
>> Subject: Re: putting stupid search engine urls back
>>
>> You really shouldn't have to use redirects, I would think. A rewrite
>> (mod_rewrite in Apache; I think nginx supports something similar) will allow
>> people to directly access the friendly urls while your application
>> internally receives a request for the ?something=something urls. If
>> rewrites are using up so much cpu that you'd rather not have search
>> engine traffic than have to take the cpu hit to rewrite urls....
>> something is seriously wrong.
>>
>> On Sat, Aug 29, 2009 at 8:27 AM, AMP Admin<admin at ampprod.com> wrote:
>>> It's just that it seems to put a strain on the server to redirect
>>> everything. Maybe I'm doing it wrong. Also, when people submit content
>>> to bookmarks sometimes they get a redirect error. It says something like
>>> 'this page redirects to' and then it won't bookmark.
>>>
>>> If I can make a nice clean sef that doesn't cause problems like that then
>>> I would love to use it.
>>>
>>> Also, if we move to a new platform a year or two down the line then those
>>> links would need another redirect if the new system uses a different url.
>>> Does that make sense?
>>>
>>>
>>> -----Original Message-----
>>> From: owner-nginx at sysoev.ru [mailto:owner-nginx at sysoev.ru] On Behalf Of
>>> Gabriel Ramuglia
>>> Sent: Saturday, August 29, 2009 12:15 AM
>>> To: nginx at sysoev.ru
>>> Subject: Re: putting stupid search engine urls back
>>>
>>> If you do that, you're wasting SEO potential. If nothing else, the
>>> search engines take into account textual content in your urls when
>>> considering the topicality of the page. Without those keywords in the
>>> url, you'll have a harder time ranking for the relevant topics of your
>>> site.
>>>
>>> On Fri, Aug 28, 2009 at 9:43 PM, Ilan Berkner<iberkner at gmail.com> wrote:
>>>> Perhaps not the appropriate forum... but why / where did you hear that
>>>> it was "stupid" to do / use SEF?
>>>>
>>>> Thanks
>>>>
>>>> On Sat, Aug 29, 2009 at 12:36 AM, AMP Admin <admin at ampprod.com> wrote:
>>>>>
>>>>> A few years ago we went along with the buzz about writing search engine
>>>>> friendly urls. Well, now I think that’s stupid and believe the site
>>>>> will get crawled regardless.
>>>>>
>>>>> Anyway, there are bots and people still looking for the sef urls, so I
>>>>> need to change them back.
>>>>>
>>>>>
>>>>>
>>>>> How can I make:
>>>>>
>>>>> /anyone-doing-late-ridei-t19640.html
>>>>>
>>>>>
>>>>>
>>>>> into:
>>>>>
>>>>> /showthread.php?t=19640