Bit, well a lot, of a rant at the forum I'm afraid.
By and large I love this place and what it offers across a broad range of topics (apart from politics 😉 )
But the bloody search function is useless.
Even using Google outside of the forum still seems to index search terms with the completely wrong date, showing 7 yr old threads as 7 day old etc.
Doesn't seem to be a way around it and I had no luck searching to see if there is a fix, which is not surprising given this rant.
Am I doing something wrong or missing a trick - really can't understand how it can be so pants yet should be a powerful tool for both new and old visitors.
You're right. It's shit.
seems to index search terms with the completely wrong date, showing 7 yr old threads as 7 day old etc.
It fools me every time.
Agree. It’s total bobbins to use. I don’t get it.
I search Singletrack on Google for all kinds of stuff because I know there's well rounded advice on here. Almost always I give up because it's impossible to find recent content, even when you know it exists. It seems an absolutely catastrophic failure to have so much data and have none of it indexed in any usable or meaningful way.
As an experiment I tried to do a search for this thread, I couldn't find it at all but I did find this:
https://singletrackmag.com/forum/topic/wtf-is-wrong-with-search-and-recent-howgills-thread/
Impressively it isn't 11 years old.
A good search facility would mean fewer new topics created?
I was considering this exact question earlier. For the Google results at least, I think it may be down to the lack of proper time stamps on posts, so it just records when a bot last scraped the data. Every other forum search I do works and they have time stamps, so I can only assume that's the cause.
The search on here never worked. Google used to be ok but something broke a while back and now that is messed up too. It is a pain but I quite like the transient nature of posts it creates now. It's really annoying on other forums where people just keep referring back to older threads (often with broken links).
I was looking for recommendations for cameras earlier today. Lots of threads that are just a few days old from ten years ago or more. Not much use!
Interesting looking at all the old forum names to see who is still around and who has gone.
There's 10 million entries in the forum post table. The default search, which we have deactivated, ploughs through that massive database every time any user enters a search term. It doesn't take more than a few searches running together to bring the site to a crawl - and that's with 12 CPU cores and a separate 4 cores just to run the database itself. That's why our solution, which was until recently working reasonably well, was to use a Google search product that uses Google's indexed version of our forum for search. That Google search script uses the existing Google index of all our forum pages but in order to use this service we have to either pay for each user search query or allow Google to run sponsored results at the top of the results page.
Although it is set to run in either of two modes, Most Relevant or by date (you can pick which one on the search page), it currently appears that the Google index of our forum is registering updates in the right-hand column as page updates. That's why at the moment it's returning silly results like posts from 8 years ago in latest date mode.
There's little we can do ourselves to solve that issue. It seems to be a recent Google change that's borked it.
Search is a task that is on our list. We have a plan to build our own separate index of the forum using something called Elastic Search. We have a licence for its use and we plan to build a new forum search system from it. This is not a trivial task however. Creating a searchable index table of the entire forum while keeping the site and the forum running 24/7 is tricky and will probably take weeks to do as we will need to throttle the rate at which it works its way through those 10 million database entries (One for every post and reply).
That's where we are at currently. I'll keep you updated as to how that project goes. In the meantime we are looking at a fix/patch/bodge to try and get better results from the Google powered search that's there at the moment.
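As a rough illustration of the throttled back-fill described above, here is a minimal Python sketch that walks a post table in fixed-size batches and pauses between batches so the live site keeps some headroom. The `posts` table layout, the `index_batch` callback, the batch size and the pause length are all assumptions made for the example; none of it reflects the site's actual schema or Elasticsearch setup.

```python
import sqlite3
import time
from typing import Callable, Iterator, List, Tuple

Row = Tuple[int, str]  # (post id, post body) -- hypothetical column layout


def iter_batches(conn: sqlite3.Connection, batch_size: int) -> Iterator[List[Row]]:
    """Yield posts in primary-key order, batch_size rows at a time.

    Uses keyset pagination (WHERE id > last_id) so each query stays cheap
    even deep into a very large table.
    """
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, body FROM posts WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


def backfill(
    conn: sqlite3.Connection,
    index_batch: Callable[[List[Row]], None],
    batch_size: int = 1000,
    pause_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Feed every post to index_batch, pausing after each batch to throttle load."""
    total = 0
    for rows in iter_batches(conn, batch_size):
        index_batch(rows)      # e.g. hand the batch to the search index
        total += len(rows)
        sleep(pause_seconds)   # the throttle: give the live database a breather
    return total
```

With 10 million rows, a batch size of 1,000 and a half-second pause, the pauses alone add up to roughly 83 minutes; slowing it further to protect the site is how a job like this stretches into days or weeks.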
Interesting looking at all the old forum names to see who is still around and who has gone.
Yup, a search often results in a trip down memory lane.
As an experiment I tried to do a search for this thread, I couldn’t find it at all but I did find this:
WTF is wrong with Search [ and Recent Howgills thread]
That's quite funny, as that was my thread, which I couldn't find the other day...
I do so hope they manage to fix it, I want to renew my membership...
currently it appears that the google index of our forum is registering updates in the RH column as page updates
Must be a way round that, surely? This can't be the only site to work this way.
There’s little we can do ourselves to solve that issue. It seems to be a recent Google change that’s borked it.
It's been this way for years, though I seem to remember it only affected results prior to 2018 until recently.
I'd be very surprised if it was unfixable. The problem seems entirely unique to singletrackworld in my experience and it's not desirable behaviour for Google. There'll be ways and means to address it.
I love the forum. I use the forum a lot. It’s my go-to for lots of info, not only about biking. But it is the most frustrating forum I’ve ever used. At least there is a like button now. I’m sorry to sound so unappreciative. But the functionality is really naff at times. A lot of times. Maybe it doesn’t need to be any better, since lots of us seem to use it anyway.
@mark can't you just give us all a few index cards and a date range?
Serious answer - I know for me it would be much, much better if there was some way of restricting at least the initial search results to a single result per thread, such that once there's a hit in a thread the rest of that thread is dropped to the bottom of the relevance list.
The biggest bugbear for me is if I search for eg "snowdon" the first hundred results will be from a single thread discussing using Yr wyddfa (Probably from 8 years ago) and nothing else despite hundreds of threads which would be at least as appropriate.
Personally, restricting search results to the first post only in the first instance, and then once it's all indexed adding a tick box to search within threads, would be a huge boon. It would also mean you could have a usable search before you've indexed 10 million replies, as you'd only need to index the first post. (Obviously I don't have a clue if that actually makes it harder and more complex, but it's Friday and I've had beer so I'll speak like an authority.)
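The "one result per thread" idea in this post can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the `Hit` record and its `thread_id`, `post_id` and `score` fields are hypothetical, and a real search backend would normally do this grouping itself.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    thread_id: int   # hypothetical: which thread the matching post belongs to
    post_id: int     # hypothetical: the matching post
    score: float     # relevance score, higher is better


def collapse_by_thread(hits: List[Hit]) -> List[Hit]:
    """Show the best hit from each thread first; push repeat hits to the bottom."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    seen_threads = set()
    first_hits: List[Hit] = []
    repeat_hits: List[Hit] = []
    for hit in ranked:
        if hit.thread_id in seen_threads:
            repeat_hits.append(hit)    # thread already represented above
        else:
            seen_threads.add(hit.thread_id)
            first_hits.append(hit)
    return first_hits + repeat_hits
```

Applied to the "snowdon" example, a hundred hits from one thread would take up one slot at the top, with the other ninety-nine following after every other matching thread has had its turn.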
We've only been using the Google search function since January.
The default Wordpress site search simply can't cope with the size of our database.
Other websites our size can often have really good search systems. None are easy or cheap to implement. If you think about it, what we are trying to build is our own version of Google restricted to just our forum. It's not hard to find anything in the database. What's hard is finding it while not slowing the rest of the site down. That's why the ultimate solution is to build a separate table that contains just the content from every post that is to be searched. Searches would then be made on that index table rather than the live post table: a separate, stripped-down duplicate of the forum, if you will, created solely for searching. That's what an index is. A separate database table that is also an almost live copy of the forum, so that if you search for something you posted a minute ago it would turn up in the results. It's not a simple task for 1.5 developers. But like I say, we are getting to it.
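To make the "separate index" idea concrete, here is a toy in-memory inverted index in Python. It is not Elasticsearch and not the site's planned design; the `SearchIndex` class and its methods are hypothetical names for the sketch. The point is only that a query looks up words in a purpose-built structure instead of scanning every row of the live post table, and that a new post can be added to it the moment it is made.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into simple alphanumeric terms."""
    return TOKEN.findall(text.lower())


class SearchIndex:
    """Maps each term to the set of post ids that contain it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[int]] = defaultdict(set)

    def add(self, post_id: int, body: str) -> None:
        """Index one post; calling this on every new post keeps the copy almost live."""
        for term in tokenize(body):
            self._postings[term].add(post_id)

    def search(self, query: str) -> List[int]:
        """Return ids of posts containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return []
        # .get avoids creating empty entries for unknown terms
        matches = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            matches &= self._postings.get(term, set())
        return sorted(matches)
```

A real engine adds ranking, stemming, persistence and much more, which is where the weeks of work go.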
Since we put the current Google forum search live there's been 82k searches made.
Here's the top searches this month.
crc: 37
orbea rise: 31
mavic wheels: 17
secan: 15
capra: 14
tubeless tape: 14
marley: 13
830: 13
internal doors: 13
inguinal: 13
ibis ripmo af: 13
herniated disc: 13
specialized epic evo: 12
redlands: 12
bafang: 12
balding: 12
mezzer: 12
santa cruz stigmata: 12
laptop: 11
komoot: 11
And for those interested, type 'why is wordpress search so bad?' into Google.
Or have a look at this (one of many, many references to the problem). https://blog.cmbr.co/wordpress-search-sucks-here-are-the-best-solutions-e1b0555030ff Note the recommendation of Elastic search at the bottom. The other seemingly simple solutions on that list are just enhancements of the existing search and result in the same issues when dealing with a forum database the size of ours.
That's obviously bolox mark, over half those search terms are bike related, not one is about wood burner or coffee and I only had to Google a single term to see what it actually meant/if it was a real word, though having done so I'm not surprised to see it on the list, even if it is lower than I'd expect.
You've made the whole list up haven't you?
currently it appears that the google index of our forum is registering updates in the RH column as page updates. That’s why at the moment it’s returning silly results like posts from 8 years ago in latest date mode.
There’s little we can do ourselves to solve that issue. It seems to be a recent Google change that’s borked it.
@mark Get your tech peeps to look at https://developers.google.com/search/docs/appearance/structured-data/article and https://developers.google.com/search/blog/2019/03/help-google-search-know-best-date-for
We’ve only been using the Google search function since January.
As a user, I've attempted to use the site search maybe a handful of times in I don't know how many years. I can't speak for other people on this, but I've come to assume that internal search on most sites are junk, so I default to the familiarity and reliability of Google, which in most cases does the job much better anyway. A lot of the time I'm browsing general topics and looking for results from a number of sites, so it's more convenient too.
In this case though, the experience from Google is so bad that it doesn't just affect regular users: for many people that will be their first experience of singletrackworld. It seems to me the primary issue is Google search. If an updated internal search fixes that, great. If not, it's going to make little difference to me and the thousands of other people who access the forum via Google.
We score 98% for structured data on the site. Like I said, searching and finding stuff is easy. It’s just a bloody great table of text at the end of the day. It’s searching through a live database that is being continually accessed and new data added to it constantly. There’s over 300 people looking at this forum right now at 11pm on a Friday night.
The solution is building our own index that can be organised and searched separately from the live database. That’s what Elastic search does, but it isn’t a plug and play tool so it’s going to take some time.
From a business perspective, is this lack of meaningful searchability leaking you potential money? A database of 10 million must contain a hell of a lot of junk but within that there must be some (rough) diamonds. That's quite the asset. Does a thread from the archive that is successfully searched for, brought back and read now gain you the same potential earnings as some new drivel generated today? I guess as it's surrounded by the same adverts it would. Any guess/clue how much traffic you are missing out on by people not bothering to search currently?
Counter to that would be if there was a sane way to search back you'd discover that most everything worth asking has already been answered and it might curiously dissuade new discussion.....
I hope you can get it to work Mark - not just for the searching for us but surely you can monetise the tech advice in there somehow?
Using google.com to search the forum works very well. Casually searching for stuff and ending up on the forum is very common and drives a lot of traffic to us. That’s a separate issue to the one of searching the forum from the forum itself. In terms of traffic you are already here in that scenario. Traffic loss is impossible to calculate really, but undoubtedly poor site search doesn’t help and we will lose out. As you say though, it’s a commercial choice what we prioritise for dev time with such a small team and that’s why this particular issue, as frustrating as it is, is not at the very top of the todo list. But it is slowly floating upwards and we will get to it.
We score 98% for structured data on the site.
Ok, but Google isn't picking up the correct date. Got to be worth a little bit of developer time looking into why rather than saying it's a bug in Google.
Thankyou Mark for being so open and coming to the thread to explain.
The reasoning is understandable and I suspect it probably ties into one of the forum's main features - it is just one big group of 'stuff' instead of segregated forums, like most others are - thus limiting the search function options and instead leaving one huge chunk to trawl.
Would there be any merit in only allowing say the last 2 years worth of posts to be indexed or 'available for searches' instead of the whole database? At least until a better way to manage the whole thing and not hog resources is found.
I suspect those searching are doing so for relatively new posts and advice.
Not sure how feasible that would be, or practical, but if it could at least get the dates right it would be a good thing.
using google.com to search the forum works very well.
If you're looking for advice from 14 years ago it works perfectly.
I appreciate my comments might come across as overly critical but they're not intended that way. I search for stuff on singletrack almost daily using Google search. I'd estimate about 80% of the time I give up and search for the same info elsewhere as 100% of the results are too far out of date to be any use, despite saying the topics are only days old. It's a massively frustrating experience.
I've just checked on https://validator.schema.org/#url=https%3A%2F%2Fsingletrackworld.com%2Fforum%2Ftopic%2Fwhat-tyres-do-you-use-on-a-cyclocross-bike-when-on-the-road%2F and you are indeed using the correct structured data - so I apologise for my earlier posts suggesting you hadn't looked at this properly. This page shows as 8 days old on a search for 'tyres' yet the structured data does indeed have it as from 2010. Good luck!
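For readers unsure what "structured data" means here: it is machine-readable metadata embedded in the page, commonly as JSON-LD using schema.org vocabulary, that states facts such as the publication date explicitly. The Python sketch below builds an example of that kind of record. The helper name, the example URL, the headline and the dates are all made up for illustration; the site's actual markup may well differ.

```python
import json
from datetime import datetime, timezone
from typing import Dict


def forum_post_jsonld(headline: str, url: str,
                      published: datetime, modified: datetime) -> Dict[str, str]:
    """Build a schema.org DiscussionForumPosting record with explicit dates."""
    return {
        "@context": "https://schema.org",
        "@type": "DiscussionForumPosting",
        "headline": headline,
        "url": url,
        # ISO 8601 timestamps, including the timezone offset
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


# Hypothetical thread: first posted in 2010, last replied to a day later.
example = forum_post_jsonld(
    headline="What tyres do you use on a cyclocross bike?",
    url="https://example.com/forum/topic/example-thread/",
    published=datetime(2010, 3, 1, 12, 0, tzinfo=timezone.utc),
    modified=datetime(2010, 3, 2, 9, 30, tzinfo=timezone.utc),
)
print(json.dumps(example, indent=2))
```

A validator only confirms that a record like this is present and well formed; it does not control which date a search engine chooses to display.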
At least there is a like button now.
... for paying members. It's blanked out for free members.
We score 98% for structured data on the site.
Could you expand on that? I doubt the vast majority of readers understand what that means.
Like I said, searching and finding stuff is easy.
Sorry Mark, but it's really not. I've tried to respond to user questions thinking "there was a thread about this a month ago" and search engines either within or without have not borne fruit.
And this isn't new, Google or no. I've been here, what, twelve years? And I've never known a functional site search aside from a Chrome plugin. I appreciate the challenges on a site of this size - and I'm very well versed with Elastic and agree that it may well be the solution - but it is a nonsense to suggest that it is easy to find stuff currently.
There's a recent thread asking about air fryers, there's a slightly less recent thread talking about air fryers, can you find both and link back the latter as a response to the former? I couldn't when I tried, not before I filed it under "more hassle than it's worth in trying to help someone" anyway.
I couldn't find a thread on recumbents from a couple of weeks ago because Google will only serve 10 year old threads (marking them as recent). A simple thread title search would probably return more relevant results than indexing all content, which throws up a jumble for some terms (like Snowdon above). I thought about implementing this with a scraper, which would probably only take me a day or so, but I've no shortage of paid dev work to be on with.
using google.com to search the forum works very well.
Sorry, this is categorically incorrect. It used to work well but now just brings back ancient posts which it thinks are a week or two old.
I don't understand why you're trying to fix the internal search. It's never been any good. Why reinvent the wheel? Just get google search working like what it used to.
Just get google search working like what it used to.
Umm...
He already addressed that above, something changed at Google's end.
He already addressed that above, something changed at Google’s end.
Right, so change something at STW end to sort it. Or get Google to fix it.
No you're right. If google changed something then that's it. Game over man. Let's build a search from scratch that's better than Google. I mean, how hard can it be....
🙄
It used to work well but now just brings back ancient posts which it thinks are a week or two old.
Do we know why? Which field is Google indexing to create this result?
This is quite good: https://www.algolia.com
Search on most online forums is not great. Unrealistic to expect awesome for what is essentially a free service.
Google is ignoring the date data specifically embedded in the page source to tell Google publication date, in preference for when it notices updates to the sidebar content which happens regularly.
If this gets fixed does that mean we'll see fewer questions about towbar racks and best xc tyres?
Like I said, searching and finding stuff is easy.
@cougar apologies. I should have been clearer. Search on here is awful. What I meant is ‘fundamentally’ search is easy. It’s just matching a word to a document. But it’s not easy when a system has 10 million database entries to search through while remaining functional to the users. I wasn’t suggesting searching on here is easy. It’s clearly not.
It's a problem that on the surface feels like it should be easy and I totally get why so many of you are frustrated by it. I hope I’ve explained enough that it is clearer that it’s not an easy fix to do. If it was, WordPress, which runs over 40% of the world's websites, would surely have made the default search better by now.
In the meantime we are pinning our hopes on elastic search 🙂
Right, so change something at STW end to sort it. Or get Google to fix it.
No you’re right. If google changed something then that’s it. Game over man. Let’s build a search from scratch thats better than Google. I mean, how hard can it be….
🙄
@thegeneralist I'm not saying it's game over. However, I'm not a dev, but even I know how opaque Google's search algorithm code base is; if it wasn't, people wouldn't be making money doing SEO work. It's literally a case of "well I know something's changed but I don't know what so in the absence of further info I'll just have to fling shit and see what sticks". Which always goes so well.
I've disagreed with Mark a lot in the past about stuff but honestly, I don't know any better solution at this point, more so if you don't have to rely on Google and their fickle algorithm.
Yes, a lot of this is down to legacy issues which have created a rod for the forum's back, but here we are. Short of breaking it down into boards and sifting through a lot of stuff that is probably irrelevant, or just archiving the lot and starting again, I'm not sure what other option there is.
TBH at this stage I reckon any investment in the forum is a good one. Social media (shut up Cougar, you know what I mean) has long lost its shine and proven time and time again that platforms are born and die in no time at all - Twitter is knackered, Facebook is used by the parents of the early adopters that have long moved on, Reddit is having its own self-created crisis and the rest are either useless for what happens here or just too small or complicated to work.
What I meant is ‘fundamentally’ search is easy. It’s just matching a word to a document. But it’s not easy when a system has 10 million database entries to search through while remaining functional to the users. I wasn’t suggesting searching on here is easy. It’s clearly not.
Sure. And that makes total sense.
With a dataset the size of STW searches are 'expensive' and Elastic absolutely kills at this, I deployed it as part of a suite of software in... something I can't talk about and given sufficient RAM to breathe it was a paradigm shift.
I can't help but wonder though why Google itself copes so badly here. Is there something funky going on that it doesn't like with metadata or something?
I’m not a dev but even I know how opaque Google’s search algorithm code base is...
Which is mostly designed to prevent people gaming the system, pushing less relevant results to the top of the list. As a product, their priority is to provide the most relevant content to users. The issues with the search on this forum are the kind of thing they'll provide plenty of info on to prevent, as the results they're listing are exactly what they don't want, they're completely inaccurate.
Not saying it's easy, but I'm fairly positive it's fixable and I'm surprised it's not top priority given that it's the gateway to the forum.
Maybe, as I say I'm not a dev so have no real idea, I just appreciate it's not necessarily as easy as people think it may be. Or it might be. I dunno.
If you'd done a search you'd see lots of similar threads to this one...oh,...wait 🤣
STW is no different from lots of other 'string and laggy bands' sites when it comes to this. The easy step is just to Google and enter the search with 'singletrack' after it and job jobbed. My employer's website (a university) is just the same - the supposed Google-powered search is crap, so I just end up stepping out to Google as above. That's not really an answer to the OP's question, but the solution renders the question irrelevant.
🤷‍♂️
@moimoifan say you've not read the thread without saying you've not read the thread 😂
say you’ve not read the thread without saying you’ve not read the thread
Sort of. But not really.
I have quite a good memory so if I'm looking for a specific thread I often remember key words from the title or the OP or recognise the thread from a list of four or five.
If I'm looking more generally to see if a similar question has been asked before (usually maintenance or bike parts), I'm happy to look through a couple.
Stepping outside of STW to Google works just fine for me.
Sorry to disappoint you and your LMAO emoji. Maybe if I want to refer back to this thread in a couple of months time, I could conduct an experiment and type in "Sarcastic numpty trying to point-score singletrackworld" and see if it finds your post?
L
M
A
O
🙄
Maybe the forum/s could be on a different server. Then a good thorough search wouldn’t slow the rest of the site. As per the mood of this thread, it’s rather frustrating like it is.
Kudos to Mark for explaining though.
Also the member notifications. No one really needs a list of every post in a thread - just one that the thread has been updated. It seems that’s how other forums do it.
PS “notify me of follow up replies” doesn’t work, nor do messages get notified. (Is that service supposed to generate an email?)
Good luck Mark.
I’m happy to look through a couple
Me too but since singletrack towers cocked the timestamp up I’m likely to have to search through significantly more. Or give up.
Moimoifan in this instance squirrelking is completely right.
Surely there is some way of getting Google to play nice, it's hardly a unique website situation to have a dated article with some always updating bits surrounding it. If it can be solved there, it benefits people googling generally and people specifically searching the site.
Search is generally a solved problem, not something I'd expect a bike website to be having to invest in bespoke "build a new forum search system" and running it.
@moimoifan I've Google searched for threads I absolutely know exist and....
Nothing.
Cougar seems to concur and he's got a lot better IT skills than I do.
Mark seems to think so too and he runs the place.
Christ, I couldn't find the bloody Tamiya thread the other week, something is very wrong with how Google is cataloguing this site. Regardless of where the failure lies it's a problem nonetheless.
STW is no different from lots of other ‘string and laggy bands’ sites when it comes to this.
Laggy bands? Laccy, probably. Derived from 'elastic.'
I remember as a small child, innocently owning a game of Eye Spy with "LB," no-one worked out lastic band.
I don't think you're in any position to L your A O about other folks' posts.
Laggy bands? Laccy, probably. Derived from ‘elastic.’
'Laggy' around here - just like 'plaggy' for plastic - same slang etymology, same result.
That's a bit like trying to score points based on the fact that I'd call a bread roll a 'cob' and someone else call it a 'batch' or a 'barm'.
We're wandering well off-topic, obvs, but if you're going to try to point-score at least do it with solid ground under your metaphorical feet, eh?
Still.
L
M
A
O
😒
Oh, and incidentally, look what 'real' Google produces when I enter "Christmas tamiya singletrackworld".

Must be magic, I guess.
🤷‍♂️
Oh, and incidentally, look what ‘real’ Google produces when I enter “Christmas tamiya singletrackworld”.
Er, that just illustrates the issue doesn't it? You can search but the dates are all wrong (first entry is actually from 5 years ago) so you can't tell which are recent threads and which are ancient. 🤷♂️
@Mark have you tried embedding dates other than as (or in addition to) structured data? Obviously, structured data is the done way but if it's currently broken it may be worth going belt & braces until Google fixes their end.
@moimoifan I’d hope the comments are not aimed at you directly, more in relation to the thread.
If you need to search for a specific thread then fair enough Google may offer what you need but, on the whole people search for generic advice around a subject. This is when the randomness of results from a date perspective crops up - pretty much every time - on the site directly or through Google.
And that is what this thread is about, so pushing a point that Google ‘works’ is not entirely relevant or accurate as a statement in the context of this said thread.
‘Laggy’ around here – just like ‘plaggy’ for plastic – same slang etymology, same result.
That’s a bit like trying to score points based on the fact that I’d call a bread roll a ‘cob’ and someone else call it a ‘batch’ or a ‘barm’.
We’re wandering well off-topic, obvs, but if you’re going to try to point-score at least do it with solid ground under your metaphorical feet, eh?
People say stupid shit all the time. Doesn't make them any more right.
You're quite gobby for someone who only joined in April. What was your last banned username, I wonder.
But it’s not easy when a system has 10 million database entries to search through while remaining functional to the users.
It really isn't. My job used to be designing and building enterprise IT systems, and in the early days building a system that didn't freeze up when the CEO asked for a report was really difficult. If it was a single timezone system you could restrict reporting to overnight runs, after the backups, not an option for global systems. In the end, the pattern adopted by most large systems was to pull all the data out into separate reporting databases, accepting all the additional complexity and costs of hardware, licences, staffing etc that that incurred. This is essentially what @Mark wants to do with Elastic Search, but despite it being open source and the basic product free to use, implementing it will be neither trivial nor necessarily cheap.
despite it being open source and the basic product free to use
They changed the licensing a few months back. The free version is quite old now.
That all just smacks a bit of “being more precise with search terms increases the likelihood of finding what you are looking for shocker”.
Whatever, it didn't work when I searched for it. Same terms and everything, it was throwing up similar threads but not what I was looking for. Don't you think that's an issue of itself, that it's not consistent?
And sometimes it's not possible to be that precise, sometimes all you remember is a couple of keywords and just have to hope they're either unique or recent enough that you're not going to be trawling through 12 years of results.
@Mark have you tried embedding dates other than as (or in addition to) structured data?
I'd ask Google what the problem was. But if they won't say then I'd guess that Google wants to see the date from the structured data displayed on the page. The 'Posted 5 days ago' under forum posts could say 'Posted 5 days ago - 25 July 2023'.
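The suggestion above, showing the absolute date next to the relative one, could look something like the following Python sketch. The function name and the wording rules are assumptions for illustration, and whether a visible date actually changes what Google displays is the poster's guess, not something this code can confirm.

```python
from datetime import date

# Spelled out so the output does not depend on the system locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def posted_label(posted: date, today: date) -> str:
    """Return e.g. 'Posted 5 days ago - 25 July 2023'."""
    days = (today - posted).days
    if days < 1:
        relative = "today"
    elif days < 365:
        relative = "%d day%s ago" % (days, "" if days == 1 else "s")
    else:
        years = days // 365  # rough year count, good enough for a label
        relative = "%d year%s ago" % (years, "" if years == 1 else "s")
    absolute = "%d %s %d" % (posted.day, MONTHS[posted.month - 1], posted.year)
    return "Posted %s - %s" % (relative, absolute)


print(posted_label(date(2023, 7, 25), date(2023, 7, 30)))
```

Keeping the relative wording means regular readers see the familiar label, while the full date is there in plain text for anyone, or anything, reading the page.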