The great big Category debate
Admin
May 26, 2008, 4:54pm
I don't really know how to phrase this, or even kick off the discussion for that matter, but I've been looking into categories and categorisation recently, and there doesn't seem to be any accepted set of these, or any sort of standardisation, within the UrbEx or historical research communities. Having looked to them for guidance, there seem to be two camps: one has its foot firmly planted in impossibly broad categories that are next to useless, the other in impossibly specific categories that become useless because there are so many of them. The former produces member lists that are huge, and cry out for more categories, or sub-categories, to be added, while the latter has so many individual entries that it no longer does its job of reducing the number of subjects listed.

I'm throwing this open to the forum for anyone to post their tuppenceworth, something in the way of a bit of brainstorming to see if anything interesting appears.

Anything's welcome, be it a description of how a meaningful collection of categories might be created and managed, all the way to a list of categories that might apply to our subject items.

I'll be keeping away, to start with at least, as I don't want to steer any thought in any particular direction, intentionally or unintentionally.

What I have done, for the moment, is add a Category Cloud to the foot of the site's pages, showing what is available and in use at the moment.

It's not a finished item, and is probably one of those things that never will be, but at the moment it lacks structure.

One important definition that is in place is the rule on plurals: as our pages usually refer to a single subject or item, the category name will always be singular. Yes, it will be wrong on occasion, but then again, so would the plural, and having both just means two category names for one category subject, and that's even sillier.

Admin
July 9, 2008, 11:58pm
Ah well, it was worth a try in case something inspiring, or some lateral thinking was thrown in, and might have sent me off on a new path to try and address this subject.

Not surprisingly, this isn't the only site (or application) that has this problem, and I've usually been able to resolve it using normal database techniques. The key difference with those that I've designed in the past is that their content and scope have been more focussed and specialised. SeSco content can be just about anything, ranging from people and places, to events and objects, and anything else in-between and roundabout. This renders the usual hierarchical approach all but useless, as subjects can belong to more than one category, and a category may be applicable to more than one subject.

As an example, ponder for a moment how the following subjects may each be quite distinct, but also lie within one another's possible categorisation...

battery
gun battery
anti-aircraft battery
coastal battery
searchlight battery
defence
anti-aircraft defence
coastal defence
light anti-aircraft battery
heavy anti-aircraft battery

All are valid, each could be used on its own, or with a number of others from the list, none would be incorrect, or any more or less correct than another.
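
The many-to-many relationship described above (a subject can sit in several categories, and a category can hold several subjects) can be sketched in a few lines of Python. This is only an illustration under assumptions: the page names and the category assignments are invented, and nothing here reflects how the site's own code stores categories.

```python
from collections import defaultdict

# Hypothetical pages and the categories assigned to each one.
# Page names and assignments are invented purely for illustration.
page_categories = {
    "Example HAA Site": {
        "battery", "gun battery", "anti-aircraft battery",
        "heavy anti-aircraft battery", "anti-aircraft defence", "defence",
    },
    "Example Coastal Site": {
        "battery", "gun battery", "coastal battery",
        "coastal defence", "defence",
    },
    "Example Searchlight Site": {
        "battery", "searchlight battery", "anti-aircraft defence", "defence",
    },
}


def invert(mapping):
    """Build the reverse lookup: category -> set of pages carrying it."""
    index = defaultdict(set)
    for page, categories in mapping.items():
        for category in categories:
            index[category].add(page)
    return dict(index)


category_pages = invert(page_categories)

# One page appears under many categories, and one category lists many pages.
print(sorted(category_pages["gun battery"]))
print(sorted(category_pages["anti-aircraft defence"]))
```

Because neither side owns the other, there is no single tree to draw, which is why a strict hierarchy struggles with this kind of content.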

I don't have a magical solution to the question, much as I would like to, and I don't know how to resolve it using the tools and functions available within the current site configuration.

What is clear is that this is a vast problem requiring a similarly vast solution, which is no bad thing.

Well, it is in so far as there will be no solution tomorrow. When I refer to it being no bad thing, what I mean is that I don't feel quite so stupid, as I can usually come up with an answer to this sort of thing fairly quickly, and the fact that nothing seemed to work led me to look elsewhere.

To get an idea of the size of the problem/solution, have a look at English Heritage's introduction to the National Monuments Record Thesauri for museum use. This shows the sort of categorisation, or thesaurus, we would need to develop. We don't really need to develop it, as there is already one in existence for The Defence of Britain Project. Unfortunately, it's theirs, and we can't do a sneaky download and reuse it.

It wouldn't really help anyway, as it would also need to be integrated with our code, but the word list would be nice to have. However, we can still freely refer to it for guidance.

I may spend some time studying it, and see if it provides any inspiration for reworking any of the code already in, or available for, the site, since it helps show what's been wrong with the approach so far, and why it refused to work effectively.

Wikipedia's no better. Just look at the hundreds of categories created there. It may have categories for each subject, but try and find one!

The Fox
July 10, 2008, 10:01am
My thoughts, for what they are worth, are that in general we have too many Categories, or rather what are in effect sub-categories. Is there any need for categories such as "Anti Aircraft Batteries", since visitors can find these by using the search button?

I think the main categories should be the likes of Defence, Transport, People, Civilian. Second level categories should allow selection by date and be the likes of WWI, WWII, Cold War, Pre WWI, Inter War, Post War. I am not sure there is any need to go further than this unless/until the number of pages goes up into 5 figures. Mind you, a geographical category would be useful for visitors who want to find things of interest in their area, although this could be covered by summary maps based on local government areas showing all the currently covered sites.

One category that I do not believe has any value at present is remains. It throws up a miscellaneous collection of often unrelated pages. This could be dealt with quite well by having a code letter, e.g. R, included in the page title where appropriate.

Admin
July 11, 2008, 12:30am
Ta. Some interesting thoughts "from the other side" as it were.

Some specifics relevant to some of the points...

Categories such as "Anti-Aircraft Battery" (categories are always singular for consistency) are indeed needed, because the operation of the search function means that a search posed as anti aircraft battery will return every page that contains the word anti, aircraft or battery. In order to force the search to return anti aircraft battery, the visitor would need to know that the phrase has to be enclosed in quotation marks: "anti aircraft battery".

There's also the hyphen that fouls up the search, and then the use of battery or batteries.

There's a further small problem, the page would have to have the exact phrase used on it in order to be found by the search.

And one last hiccup, the search would return EVERY page containing the search term, including all the pages that just happen to mention an anti-aircraft battery as a passing remark somewhere in the text.

Oops, I'm used to these, and take them into account without thinking, and didn't realise there were so many pitfalls for the unwary in finding by search rather than category.

Unfortunately, even a second level category isn't possible with the current code, so we're stuck with the "main" ones as it were. There is a method of doing something similar in the existing code, but I did have a try at it once, and while it does certainly work, it's not a runner as it's really high maintenance, and needs someone to create everything that goes into it, and set each category and sub-category by hand, including the ones that can exist in more than one main category. Anyone trying to use it would need training!

It does/would work, and I wrote something similar for an account/expenses database which just about worked for the untrained user, and actually had 5 levels of sub-category under the main ones. So it was even more complex, but easier to use. I think the reason this worked was because in any given application, the possible expense entries were restricted to a known subset of possible entries, while the SeSco stuff is much more generalised.

Geo-tagging would be good, but I rejected it early on, on the basis of criteria - name of: street, village, town, city, county, region - which to include, which to leave out? (Not asking, just indicating problem. Some places would work better with some criteria than others.)

I think I referred to the existing irritation of the map-points living on the individual pages, rather than a single list from which they could be accessed as a group.

"Remains" is a humble little category that came over from another application - it may or may not be useful here, but is/was worth a try, and quite a few of the categories actually came from other projects I've run elsewhere. Remains was very useful on the others, as one of the aims was to let strangers know if there was any point in actually heaving themselves out of their comfy seats and visiting a site. In particular, it was used with Exists. If a site Exists, it may or may not be complete, so Remains told them it was not complete, but consisted of only bits, or remains.

It's probably not right to say that there are too many categories,  but based on what we've seen, there is the problem of some of them indeed being sub-categories, or, if you can bear with the NMR thesauri theory, there are BTs and NTs - broad terms and narrow terms - and some sort of rationalisation is called for.
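
As a rough illustration of the broad term / narrow term idea, here is a minimal Python sketch of a thesaurus in which each narrow term points at one or more broad terms. The relationships shown are assumptions made up for the example (using terms from earlier in the thread), not entries taken from the NMR thesauri, and the names `broader` and `all_broader_terms` are hypothetical.

```python
# Hypothetical thesaurus: each narrow term maps to its broad term(s).
# A term may have more than one broad term, so this is not a simple tree.
broader = {
    "heavy anti-aircraft battery": ["anti-aircraft battery"],
    "light anti-aircraft battery": ["anti-aircraft battery"],
    "anti-aircraft battery": ["gun battery", "anti-aircraft defence"],
    "coastal battery": ["gun battery", "coastal defence"],
    "gun battery": ["battery"],
    "anti-aircraft defence": ["defence"],
    "coastal defence": ["defence"],
}


def all_broader_terms(term):
    """Collect every broad term reachable from the given term."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(all_broader_terms("heavy anti-aircraft battery")))
```

With something like this, a page would only need tagging with its narrowest term by hand, and the broader categories could be derived automatically, which speaks to the concern about manual upkeep.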

I'll add those thoughts - and any more that anyone might care to add in, the thread remains open - and keep rolling them around and comparing them with what we have in the code library for handling them with some degree of automation, because any solution that needs anything more than the tiniest manual intervention after creation is simply not practical or workable.

the_historian
July 11, 2008, 3:29pm
I agree with Fox, with the proviso that any geographical divisions should be kept to the North, South, East, West variety based on the Forth-Clyde line, with Orkney/Shetland perhaps also thrown in. From my own experience, basing things on local govt areas goes to hell in a handcart the minute they decide to rearrange things, like in 1996.

The Fox
July 12, 2008, 7:34am
I did not mean sub categories in the terms of "categories of lesser value" and hence requiring several lines of code to work. Any apposite group of category names has to be either in a vertical list or to follow on consecutively.

As a result one has to be first and I was suggesting this should be a general list of areas of interest. This would allow the casual visitor to decide where to go even if it is somewhere else.

Assuming the search engine works on a word + word basis or "word + word" basis, it should then be able to present a visitor with a reasonable list of cogent pages when a general category is accompanied by another, e.g. WWI.

Again, assuming that the search engine is capable of word+word+word or "word+word+word" operations, then adding a third category as a geographical area would be possible.

I assume from my very limited knowledge of how the oily bits work that none of the above would preclude a visitor going just for a geographical area in the first place, as the categories might be presented as 3 hierarchies, but as far as the search engine is concerned they are all of equal value and from the same list.
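
The idea of categories that are presented as three groups but are all of equal value to the software could look something like the following Python sketch. It assumes each page simply carries a flat set of category names; the pages, their tags, and the function name `pages_with_all` are invented for the example and are not taken from the site.

```python
# Hypothetical pages, each tagged from one flat list of categories.
# Subject, period and area tags are mixed together with no ranking.
pages = {
    "Example Pillbox": {"Defence", "WWII", "East Lothian"},
    "Example Airfield": {"Defence", "WWI", "Fife"},
    "Example Tramway": {"Transport", "Inter War", "Fife"},
}


def pages_with_all(pages, *wanted):
    """Return the names of pages that carry every requested category."""
    wanted = set(wanted)
    return sorted(name for name, tags in pages.items() if wanted <= tags)


# A visitor can start from any kind of category, or combine several.
print(pages_with_all(pages, "Fife"))
print(pages_with_all(pages, "Defence", "Fife"))
```

Nothing in the lookup treats one kind of category as "first", so starting from an area alone works just as well as starting from a subject.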

Admin
July 12, 2008, 12:31pm
There's currently no way to achieve sub-categories of any form. Each category exists in complete isolation, so there can be no 'first' category in here, or any sort of category hierarchy.

To give an idea of what can be done, and also of how quickly it would grow to need a manager employed to look after it as the quantity of categories grew, here's an example. Ignore the start of the page and just look at the Category Nesting section - I'll apologise in advance if it's unintelligible (but I've no other example to point to), it took me ages to decipher the logic, and I understand the code that makes it work! The method works fine, but is really only of use in personal systems, with one person in control, or a formalised, team-based project, controlled by a procedural manual containing a set of rules for all to work to.



I think I've failed to convey the important part of the search explanation, as you said "Assuming the search engine works on a word + word basis or "word + word" basis", and "Again assuming that the search engine is capable of word+word+word or "word+word+word" operations", as these are complementary operations.

You have to appreciate that placing words between quotes makes them into a single, unique, string of characters - as seen by the search. In other words, the words used don't actually matter (unlike the case where the words are just typed with a space between them).

Thus "anti aircraft battery" is seen by it as a 21-character word, including the spaces, and is all it will match. If a page contains "anti-aircraft battery" (hyphen added) it will be ignored, as it's a completely different 21-character word.

If the user puts the three words anti, aircraft and battery in the search box, not enclosed in quotes, then the search will return every page that contains the words anti, aircraft, and battery, but with no concern for the word order. In other words, a page with the sentence "Advanced aircraft are fitted with an anti condensation system powered by the battery" would be returned by the search made without quotes.
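
The difference between the quoted and unquoted searches can be shown with a small Python sketch. It is a toy model written for this explanation, not the site's actual search code: the function names and sample pages are hypothetical, and the choice to treat a hyphenated word as a single token is an assumption.

```python
import re


def tokens(text):
    """Split text into lower-case words, keeping hyphenated words whole."""
    return set(re.findall(r"[\w-]+", text.lower()))


def search(pages, query):
    """Return titles of matching pages.

    A query wrapped in double quotes is matched as one exact string.
    Otherwise every word must appear in the page, in any order.
    """
    query = query.strip()
    if len(query) > 2 and query[0] == '"' and query[-1] == '"':
        phrase = query[1:-1].lower()
        return [title for title, text in pages.items() if phrase in text.lower()]
    wanted = tokens(query)
    return [title for title, text in pages.items() if wanted <= tokens(text)]


# Hypothetical page texts for the three cases discussed above.
pages = {
    "Page A": "The anti aircraft battery stood on the hill.",
    "Page B": "An anti-aircraft battery was built here.",
    "Page C": "Advanced aircraft are fitted with an anti condensation "
              "system powered by the battery.",
}

print(search(pages, '"anti aircraft battery"'))
print(search(pages, "anti aircraft battery"))
```

In this model the quoted search finds only Page A, while the unquoted search finds Page A and the unrelated Page C, and neither finds the hyphenated Page B.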

Hope I've done better this time.

There's also a very brief set of examples shown if you hit "Search" in the Search Box, which takes you to the Search Page, rather than hit the "Go" button, which starts the search immediately.

The Fox
July 14, 2008, 9:05am
I wonder if this might not be the right question. Maybe we need to be more philosophical about it.

What is the website for, apart from our own aggrandisement?

Who else uses it and for what reason?

What kind of information would they be looking to find?




Admin
July 14, 2008, 1:32pm
I'll go for the philosophy, but unless we use different dictionaries, there's not much opportunity for aggrandisement in a site that numbers its (largely anonymous) "members" using fewer than the fingers on one hand after three years, but that's probably another subject.

I've only been able to give intermittent attention to this recently, but if it's not a bad use of words, one of the things that has jumped out from the above (for me at least) is that there are probably three groups of category that would be useful for the content we have. I know, not much help since we can really only have one collection of categories, but you never know what might come out of the debate.

The following isn't definitive, or a description of what's in use, just rambling inspired by what's come up so far.

The first is the simplest, and comprises the various groups that each subject we might have an article or page for could belong to. These would generally be things such as Boom Depot, Pillbox, Coastal Battery, Degaussing Station, Fuel Depot, and maybe even something less tangible like Event. These are reasonably narrow categories that would be pinned down by adding something like a place name or similar to them.

The second is a little more complicated, and less tangible. It exists separately from the first, and might be described as groups that would allow someone to find groups of things that interested them. So this could be groups like Military, World War II, Industrial and Cold War for example. There are also things like Lost, Abandoned, Remains and the like that came up earlier.

There is a third type, but it doesn't really need any thought, as it is defined by external sources. I'm thinking here of something like the Clyde AA Defences and Loch Ewe Defences for example, assigned by others and reused by us. This could also include any special cases that come up, such as Postwar AA Battery Conversions, which didn't exist until we started ferreting around.

I think this might possibly be angling itself around to some sort of criteria for creating categories.

One thing that did make itself known recently, which was a bit of a surprise, was that there was no category that would pull up all the anti-aircraft batteries. While we might have identified those in places like the Clyde, Forth and Loch Ewe, there are others we have a page for, but that don't fall into those greater groups, and couldn't be found as a single group. Apart from World War II, there's currently nothing identifying them, or making them easy to extract - and that's probably a fair indication that a category's missing, and I've missed an opportunity.

I do know why, and I've mentioned it before, but these batteries were the first to go in, and back then, references (made by others) were just to AA batteries, and the HAA and LAA differentiation only came along later, after much of the content had gone in. HAA is probably meaningful now, with hindsight.

However, don't think too hard - just the odd bit of stimulation needed.

Admin
July 14, 2008, 3:29pm
I think the current by-election must be getting to me... and I managed to use the preceding question as the basis of an answer to a more convenient question which wasn't posed by the questioner.

The aim of the categories is (for me at least) to pre-empt the mindset of the visitor, and try to guess in advance at what might be in their mind and attracting them to the main site. While the Contents may provide a list of everything in the site, it's fairly indigestible, and meaningless in many cases, as the individual names will, in many cases, only mean something to someone that already knows them.

By creating suitable categories which all the contents can be divided into, and which are more generalised and understandable, anyone (including "us" if I can use that generalisation) can later find something without having to remember the specific name, which can be difficult if it is something we found and added a while back, might be obscure, and not intuitive. This clearly becomes more important as the number of items grows too.

That was how I got diverted onto the two/three types of category I wandered off into with the previous reply.

Admin
July 28, 2008, 9:46pm
While I was removing the category Gun Battery - which is more or less redundant - I decided to reuse it as a Category to list the various types of gun battery.

We have two members at the moment, HAA Battery and Coastal Battery. LAA Battery can also be added, although we don't have any pages categorised as such yet.

If you select Categories from the main site menu, it will list the revised Gun Battery category, and selecting it will list the two members, HAA and Coastal.

Unfortunately, since the same report lists the pages in the selected category, and there are no pages in the Gun Battery category, it still tries to list them, and provides a blank listing. It's a bit unprofessional looking, and distracting, but because it is a valid, but empty list, I can't quite see how to suppress it.

However, the idea is really to float the concept of an all-encompassing category that can list more detailed categories that fall within it - in this case, HAA and Coastal are specific kinds of Gun Battery.
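
For what it's worth, the general shape of a category that lists other categories, with the empty page listing left out, might look like this Python sketch. It is only a guess at the logic under assumed data structures: the dictionaries, the page names and `render_category` are hypothetical, and it says nothing about what the site's own report code allows.

```python
# Hypothetical data: which categories sit inside a top-level category,
# and which pages carry each category. All names are illustrative.
member_categories = {"Gun Battery": ["HAA Battery", "Coastal Battery"]}
category_pages = {
    "Gun Battery": [],
    "HAA Battery": ["Example HAA Site"],
    "Coastal Battery": ["Example Coastal Site"],
}


def render_category(name):
    """Build the text of a category page, omitting any empty section."""
    lines = [f"Category: {name}"]
    members = member_categories.get(name, [])
    if members:
        lines.append("Categories in this category:")
        lines.extend(f"  {member}" for member in members)
    pages = category_pages.get(name, [])
    if pages:  # skip the heading entirely when there is nothing to list
        lines.append("Pages in this category:")
        lines.extend(f"  {page}" for page in pages)
    return "\n".join(lines)


print(render_category("Gun Battery"))
print(render_category("HAA Battery"))
```

The check before each heading is the whole trick: a valid but empty list is simply never printed.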

Does this inspire any similar classifications that suggest themselves as being useful or helpful?

Or should this idea just be quietly forgotten?

Apollo
July 29, 2008, 9:30pm
There's probably a naming or terminology conflict in this, but no immediate solution.

What this seems to show is the category Gun Battery, with the sub-categories HAA Battery and Coastal Battery, but you can't call the existing categories sub-categories.

You need a name for higher level categories that will exist over and above sub-categories that will allow you to refer to them as categories.

Not suggesting this is the right name, but conceptually, Gun Battery would be a super-category in this system.

Perhaps something simple like Group or Class would do the job. I tried searching around for other alternatives, but the only other word that turns up is... category.

Admin
July 30, 2008, 1:09am
Took the above on board, and while there's nothing I can see that can be done to make Gun Battery any different in terms of its being a category at the same level as the examples of HAA Battery and Coastal Battery (we only have one level of category available to play with, so there simply can't be sub-categories), I can now see how they can be made to appear on the page for the Gun Battery category.

I've also used the Gun Battery page as a subject page itself, where common information for all kinds of gun battery can be placed, and links added.

Slightly confusing, at least until it is seen working, is the definition of the Gun Battery as a Category, and HAA Battery and Coastal Battery as Categories within that Category - note carefully the singular and plural endings on the words. It's very important, and is what allows them to be identified and treated separately, even though they are both just categories. Best to look at the two pages and see the difference in content.

There is no link anywhere yet for the experimental top level Category page (note, not Categories, which live inside a Category) so this following link goes to the top level Category page. It only has the one entry for Gun Battery, but shows it works.

This has the potential for use in the same way elsewhere, as there are instances where you may write a description for something that would be a sub-category, and then find the same wording would be good to have on perhaps two or more other pages. Clearly, this is rather silly, because when it later becomes a good idea to update that information, or add a link, you either have to try and remember what those pages were, so you can go and add the same changes to them, or leave them unchanged and out of date. Neither option is good, whereas if there was a page that they all belonged to and that explained them in some way, then that could be changed once, making the new info available to all.

I think this might have a future. Possibly. Maybe...

Admin
July 30, 2008, 4:11pm
This is turning into a bit of an education, as I considered that Z Batteries should have been included in the overall category, but since these are rocket batteries without guns, it's hard to see how they could be included under Gun Battery.

Turns out the fundamental error lies in the use of the word "gun", and I should be using the word "artillery".

The terms gun and howitzer often confuse (there's a surprise). Technically:

    * Guns have a single propellant charge and long range, and cannot fire in the upper register - an elevation angle greater than 45 degrees;
    * Howitzers have multiple charges and can usually fire in both upper register and lower register.

So, we're actually wrong to speak of guns in AA batteries, they're howitzers, but usage has over-ridden the definitions.

Nonetheless, if I want to include the Z Battery within an overall category of such installations along with the AA Battery and the Coastal Battery, the Category of Gun Battery is going to have to become Artillery Battery.

I knew this would get complicated!

The Fox
July 30, 2008, 8:22pm
I am not convinced that we need this level of detail.  

Context, i.e. WWI, WWII etc., should be enough for most people, allied to a geographic listing.

Visitors would either be looking for something specific, in which case the search engine would find it,

or be interested in or be researching a particular period of time, in which case a list of period pages would allow them to find what they were looking for without blanking out things they might not have thought of,

or be looking for interesting items in a particular area and I do not agree with Historian in that just north, south etc is sufficient.  I do feel that local government areas are the way to go.  Boundary changes only happen very occasionally and even then usually affect a small area.

There is a danger in making the categories thing too complicated and scaring people away.

Admin
July 30, 2008, 10:00pm
Don't worry about the detail, or complexity, since it ultimately comes down to Muggins to manage.

The important thing in here is to stimulate some thoughts - if they're workable they'll be played with and influence the Categories, if not, they'll silently slip into the shadows.

I don't really want to critique individual items, but we don't have any geographic categorisation, and I'm not suggesting it be started (yet?), but the north, south, east, west is both too vague, and dependent on personal perception. Think of where English peeps think "The North" is!

The local government areas would probably do fine, but even their boundaries are a pain to find - get it wrong and I guarantee there would be a string of disgusted locals delivering virtual sackloads of complaints about having their assets stolen - well, maybe not, but you probably know what I mean.

I have to say I'm holding out for inspiration, and finding some sort of easy, automated way to put our points on a main map like I have on the ROC post page. However, as the data is in a completely different format, I don't see that coming soon. Maybe a long winter night project.

I might add that the current categories were created as best guesses with nothing to guide them, some are good, some are bad, some will stay, some will go. For example, Gun Battery has gone (as an individual page category - it is now a top level category, and will change to Artillery Battery soon). Remains will probably be targeted next, since this can be implied for most items we report on. More useful, and likely to stay, is Lost, which is an important category as it contains sites that have been lost, are of no use for a visit (unless to disprove the categorisation) and are politically and culturally significant, especially if recently lost to development or lack of care.

The Fox
July 31, 2008, 7:30am
I agree wholeheartedly about the Lost category as a classification, but I cannot imagine someone looking for Lost items. However, it deserves to be saved from Room 101.

I tried the remains category once and was disappointed, partly because the way the categories are listed made it look as if I was only choosing WWII remains, and I got the lot, which wasn't very helpful! I still think this category could be dealt with by adding (R) or something similar to the title of the item.

Apollo
July 31, 2008, 8:50am
From using other sites that specialise in location based content - which I guess a lot of things in SeSco are - defining an item as Lost is like waving the old red flag to a bull to some folk, they take it as an invitation for it to be proven wrong, and promptly head off to go and find it. A few of us have done this on related sites, and the result has been the rediscovery and documentation of a number of subjects that were classed as Lost by officialdom.

As we've learnt elsewhere, the professionals have limited time to inspect a site, and if they fail to find any immediate evidence of a subject, they're obliged to move on. Also, many of the existing records were made before there was easy access to the detailed aerial views we now enjoy - I could certainly have benefited from these a few years ago.
