Welcome to Resource Zone.

Submission feedback

I posted this in SUBMISSIONS but there was not one reply!

So... too dumb an idea to bother with, already been covered many times, too difficult, too.....

Here goes again.

Every single submitter of a site wants to know at the very least - WHEN. With the many thousands of editors that help ODP I imagine there is a lot of chopping and changing of editor/category/sites to review. I understand the logistics must be horrendous.
However, if some of the very many bright heads got together and worked out a model whereby a reasonable guesstimate of 'time to review' could be applied to a category based on past experience and other factors and then this information could be relayed to a submitter, well...

Surely this would save the ODP a tremendous amount of work regarding duplicate site submission issues and make many many submitters very happy.

Also I suspect most of the 'when will my site get reviewed' posts in the Submissions forum (a high proportion) would disappear.

Sure it won't be easy to come up with a good prediction algorithm but it might be really worth it.

Tom
 

IMHO, and from my knowledge of programming, the prediction algorithm for this would have to be very good. Not only that, you have to remember human beings can hardly be predicted by a computer. The algorithm would have to be nothing short of AI to predict even a rough estimate of the "time 'til reviewed."

Here are some of things to think of in considering this idea:

1) Editors do not edit a set # of sites a day.
2) Editors (usually) have more than one category. Some of these have more unreviewed than others.
3) (Most) editors have a real life, etc...
4) Some editors are also metas, editalls, or cateditalls on top of the categories that they are listed in as editing.
 

Perhaps a chaos theory mathematician could figure it out. How do you write an equation which would have to take into account every editor's production, determine what is happening in their job and family life (yes many of us do have jobs and families!) and every category's submission and spam rate? Using my immense math skills I've figured it to be...oh wait, what about those editors who go out and find sites without them being submitted, there goes the 0, carry the X...ugh I'm stuck.
 

Even if some sort of algorithm could calculate how long it would hypothetically take, I'd imagine that some submitters would be even angrier if they were given a *suggested* time frame and their site was not listed by then - no matter the reason.

Humans DO do it better - but it might take longer than a computer that can run URLs 24/7 without stopping for work/school/supper.
 

gti96

Member
Joined
Feb 28, 2002
Your idea isn't necessarily a bad one. It's actually pretty interesting to me, as my RL job is 100% mathematical probabilities and equations. But as others have already posted, it would take time to set parameters and assumptions for the many variables involved; add to that the programming time. I'd really rather see our IT staff working on better ways for us to get sites listed faster than produce some guesstimate of how long it "might" take a site to get listed.
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
I have about a dozen categories on my dashboard. When I edit, I open about three or four windows, and start each one on whatever activity strikes me at the moment -- a category with lots of robozilla-flagged problems or lots of unreviewed sites, an ongoing project, the day's e-mail, etc.

Wherever I go, I tend to pick out the obviously-simple-to-handle cases: egregious mis-submissions, palpable spam, probable-spam, etc. Quite often such cases turn out to be unexpectedly complex -- the same abuser is pestering other categories also, and the problem needs to be addressed everywhere. In any case, that window remains devoted to that task till I go to a site that crashes the window, or I get bored with that kind of site (or that kind of problem.)

Go ahead -- model that on your computer. (It wouldn't be that hard, actually.)

Now try to base a prediction on it, remembering well that as the number of greens gets smaller, so does the chance I'll work on the category (either because I wasn't alerted to it in the first place, or having been alerted, was distracted by psychological or technical circumstances.) I think (because of the number of random factors, and the positive feedback correlated to large numbers of greens) you'll find that you'll get almost the same predicted date for any category to which I have access. Now, I'm an editall. Go figure.

Here's the real problem: I just spent 4 hours today dealing with submissions. Do you know how many actual sites I listed? Zero. I spent about 4 hours yesterday, although I think I actually listed a couple of sites. That works out to over 90% of the time spent dealing with spam. Other editors' mileage will vary, but I suspect that for most active editors, there will be a similar proportion.

That means that the wait for site listing, however long it is, is at least ten times longer than it would be if it weren't for the spam submissions.
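The arithmetic behind that "ten times" can be sketched roughly as follows. This assumes editing time splits cleanly into a spam share and a listing share, which is a simplification for illustration and not something ODP measures; the function name is made up.

```python
def wait_multiplier(spam_fraction: float) -> float:
    """Return how many times longer the wait gets when `spam_fraction`
    of editing time goes to spam instead of listing sites.

    Assumption: only the remaining (1 - spam_fraction) of the time
    actually moves legitimate sites out of the queue.
    """
    if not 0.0 <= spam_fraction < 1.0:
        raise ValueError("spam_fraction must be in [0, 1)")
    return 1.0 / (1.0 - spam_fraction)


# 90% of the time on spam leaves 10% for listing, so the wait is
# roughly ten times what it would be with no spam at all.
print(round(wait_multiplier(0.9), 2))
```

Since the figure was "over 90%", the multiplier is at least this large; at 95% it would already be about twenty.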

Now, all of you who are concerned about these delays and yet don't have time (or the grammatical proficiency) to be editors: go out and do something constructive about it: find a spammer and break all the bones in his fingers. The world will thank you. And if enough of you do the socially-conscious thing, eventually the backlog will get shorter.
 

dstanovic

Member
Joined
Mar 26, 2002
hutcheson,

<< That works out to over 90% of the time spent dealing with spam. >>

That is amazing :( I believe that if someone is spamming the ODP they should be removed for a year; if they keep it up they should be banned permanently! You guys/gals have better things to do than waste your time with these people when there is such a backlog of legitimate sites trying to get in. It's too bad there is not some type of across the board filtering system that could automatically separate uncategorized/legitimate sites from the spammers.

Dave S.
 

lissa

Member
Joined
Mar 25, 2002
I think the idea would have a possibility of success IF one thing were true - quality sites were actually submitted to the correct category in the first place. Meaning that the sites submitted weren't affiliates, or mirrors, or vanity URLs, or deeplinks, or submitted 100s of times all over, or under construction, or so poorly set-up that an editor can't figure out what it is about. Also meaning that the category is correct based on DMOZ guidelines and usage not on the submitter's wishful thinking.

If the above were the case, then there would be a chance of figuring out a time estimate based on the number of unreviewed in that category, number of editors for that category and their activity level.
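A minimal sketch of what such an estimate might look like, assuming every unreviewed submission is a clean, correctly placed site and every editor works at a steady rate (exactly the conditions that don't hold in practice). The function and parameter names are hypothetical, not anything in the ODP editor tools.

```python
def estimate_review_days(unreviewed: int, editors: int,
                         reviews_per_editor_per_day: float) -> float:
    """Naive queue estimate: backlog divided by daily review capacity.

    Assumes a first-in, first-out queue and a constant activity level
    for every editor of the category.
    """
    if unreviewed < 0 or editors < 0 or reviews_per_editor_per_day < 0:
        raise ValueError("inputs must be non-negative")
    capacity = editors * reviews_per_editor_per_day
    if capacity == 0:
        # No active editor: the category waits for a passing editor,
        # so no finite estimate can be given.
        return float("inf")
    return unreviewed / capacity


# 120 unreviewed sites, 2 editors, 3 reviews per editor per day
print(estimate_review_days(120, 2, 3.0))  # 20.0 days
```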

However, if quality sites were actually submitted to the correct category, submitters wouldn't need a time estimate because editors would be able to spend the majority of their time editing and could probably get a lot further caught up than we are.

I realize that it is hard for the average submitter to understand what is really going on, because they assume everyone else is trying to submit properly as they have tried, but unfortunately that's not the case. The higher levels of categories have the fewest number of editors, the most spam, and the most wishful-thinking submissions. It can take a long time to process things from those categories. Because the senior editors focus at the high level, they can't spend as much time as they would like in lower levels without editors, so it can take a long time in a lower level too.

So how long does it take to get listed in a locality? A day to a week if it was submitted to the correct locality and there is an active editor. Over a year for a site submitted improperly to topical while it gets moved once or twice and then waits for a passing editor to review and publish it.

I know it is a frustrating situation for submitters. Please understand that it is very frustrating for editors too!
 

darker

This discussion has come up many times in many places. So out of curiosity, I ran some statistics during the month of March 2002, and here are the summarized results:

I was able to list 20 submissions, and took action on one change request.
At the same time, there were 107 submissions that I couldn't list in the categories I edit. About half of those were submissions of sites already listed correctly or mirrors and redundant deeplinks thereof. The rest was divided among sites in the wrong language (sent to World/), of clearly regional focus (sent to Regional/), submitted to the wrong category (sent to the correct one), without any content or nonfunctional URLs, and with any of a number of other problems (deleted).

As you see from my almost 20% success rate, I edit in areas with relatively small amounts of misguided submissions. Lucky me! :)

During the same month, I decided to hunt for dead links. So I checked all the categories I edit in the Google directory, and had a closer look at those listings that didn't show a green bar there (an extremely useful feature!). Once at it, I also browsed through the older listings, to pick out any suboptimal placements or descriptions that would jump to my eye. This activity resulted in the removal of 20 dead or doubly-listed sites, and in sending 12 to more appropriate categories. On the positive side, I could fix the URLs and/or descriptions of 20 others.

Those who counted with me will have noticed that I wasn't able to help grow the directory last month. Fortunately, I had a new category created the month before that I collected 30 sites for on my own, so I don't feel too bad. In sum, I still like to think that I was able to improve the average quality of the categories I edit in.
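For anyone who wants to count along, here is the month's tally as a small sketch. The numbers are the ones given above; the variable names are just labels for this example, and the one change request is left out. Note that "almost 20%" is closest to listed versus not-listable; as a share of all submissions handled it comes out a bit lower.

```python
listed = 20          # submissions listed in March 2002
not_listable = 107   # submissions that could not be listed in those categories
removed = 20         # dead or doubly-listed sites removed during cleanup
moved = 12           # listings sent to more appropriate categories
fixed = 20           # listings with corrected URLs and/or descriptions

total_submissions = listed + not_listable
share_of_all = listed / total_submissions   # listed as a share of everything handled
ratio_to_rejected = listed / not_listable   # listed relative to the not-listable pile

# Moved sites stay in the directory, so only removals offset new listings.
net_growth = listed - removed

print(total_submissions)            # 127
print(round(share_of_all, 3))       # 0.157
print(round(ratio_to_rejected, 3))  # 0.187
print(net_growth)                   # 0
```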

Back to the original question, I don't think an automatic estimation of the processing time is possible with satisfactory accuracy, even though it would be nice if it was. The main obstacle lies in the fact that individual editors are tending to individual submissions. The easy cases may get processed (either deleted or listed) almost immediately. The difficult ones may sit in the queue for a long time, until the editor finds the time to search for a more appropriate category for them, or to ask other editors for their opinions. The only somewhat reliable statement that could be made is for heavily backlogged categories, in that it might take "a long time" there.
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
>>I believe that if someone is spamming the ODP they should be removed for a year, if they keep it up they should be banned permanently!

Well, for extreme cases something like this is possible. But it has to be pretty extreme.

There are all kinds of, um, supererogatory submissions. Some of them are due to sheer stupidity. (Internet MLM schemes, for instance, tend to draw from the bottom of the gene pool.) Some are ignorance -- perfectly competent and honest businessmen not knowing enough about the internet, or search engines, or hierarchical directories, to understand what kind of submission makes sense. Some of the sites are innocent victims of venal and/or incompetent professional site promoters. Some may be software glitches -- either ODP or some site submission program. And, after all, the ODP is in the business of listing sites for surfers, not of executing justice.

But if you want to understand the nature of the difficulties to be overcome in creating a comprehensive web directory that accepts free submissions even from commercial sites, the fact that site submission can't include either an IQ or ethics check on the submitter must be considered.
 

Thanks for the posts - very interesting.

So it confirms, to me at least, that editing is just like life!

Please bear with me.

My wife asks me to perform a non-urgent task, I quickly calculate what is going on in my life and say "sure, probably some time during the first week in May darling."

"Great" she says, "don't forget now dear".

I didn't need a PhD in maths to give an answer that was quite acceptable and not locked in concrete.

See where I'm going to?

OK so each category has an "expected time to review" box (only seen by the editor) which they update now and then based on their guesstimate. It is not locked in concrete, nothing legal about it, changeable.

When a submission arrives an auto responder emails the submitter -
"Dear Sir, based on the current number of submissions for this editor the anticipated review date would not be before 15th May and may be later etc....disclaimer, disclaimer"
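A rough sketch of how such an auto responder might build its message, assuming the editor's guesstimate is stored as a number of weeks per category. The function name, the weeks field, and the exact wording are all made up for illustration; nothing like this exists in ODP as far as this thread says.

```python
from datetime import date, timedelta


def auto_reply(submitted: date, expected_weeks: int) -> str:
    """Build the acknowledgement text from the editor's own guesstimate.

    `expected_weeks` stands in for the editor-only "expected time to
    review" box; it is a guess, not a commitment.
    """
    if expected_weeks < 0:
        raise ValueError("expected_weeks must be non-negative")
    earliest = submitted + timedelta(weeks=expected_weeks)
    return (
        "Based on the current number of submissions for this category, "
        f"the anticipated review date would not be before {earliest.isoformat()} "
        "and may be later. This is an estimate only and may change."
    )


# A site submitted on 17 April 2002 with a four-week guesstimate
print(auto_reply(date(2002, 4, 17), 4))
```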

This would satisfy many (not all) submitters and I am confident would reduce the editors' workload.

The main point being that a perfect solution is not necessary, just one that might make submitters and editors HAPPIER.

What do you think?
 

sabre23t

Member
Joined
Mar 26, 2002
How about this?

Take a gander at the "Last update:" field at the bottom of the DMOZ category. Consider (today - last_update) to be the first cut at an estimate of your expected submission review period.

It really is a first cut estimate, since it ignores the number of unrevieweds in that category and considers only the frequency of editing in that category. However, the number of unrevieweds in that category is not available to the public.
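In code, the first cut is just a date subtraction. This sketch assumes you have read the "Last update:" value off the category page yourself; the function name is made up and nothing here fetches anything from DMOZ.

```python
from datetime import date


def first_cut_estimate_days(last_update: date, today: date) -> int:
    """Days since the category was last edited, used as a crude guess
    at how long a new submission might wait for review."""
    days = (today - last_update).days
    if days < 0:
        raise ValueError("last_update is after today")
    return days


# Category last updated 1 March 2002, checked on 5 April 2002
print(first_cut_estimate_days(date(2002, 3, 1), date(2002, 4, 5)))  # 35
```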
 

dajeffster

Curlie Meta
Joined
Mar 27, 2002
Hi all,

ThomasAJ, I agree with you it would be great if submitters had a rough idea how long a review will take. Unfortunately, the human side of the ODP makes it very open ended.

Speaking for myself, I know I can never predict how much time I will have for editing. Some weeks I can go through a few hundred sites and other weeks it could only be 10.

Once you put a "date" on when a review will occur, missing this deadline only leads to disappointment, regardless of disclaimers attached.

In many businesses, an estimated time is usually longer than the time they feel they will need. A contractor once told me a job would take a month; he was done in three weeks. I was thrilled because he finished a week sooner. If he had taken a week longer than his estimate, I would have been disappointed, his credibility would have diminished, and I would never hire him again.

As you pointed out, and rightfully so, with non-urgent tasks it is enough just to have an idea of the situation, but many submitters don't see getting their site listed as non-urgent. An example mentioned in one of the threads in these forums: in the shopping category submissions peak in October and November with sites hoping to be listed for the Holiday season, while at the same time the volunteer editor has more demands on their time away from ODP. Unlike a store, there is no "seasonal help" to assist in busy times.

Since it is all a volunteer effort, the only thing that would speed up the time would be for more people to volunteer.

I know if a system were implemented to require (not the best term to use in a volunteer project) an estimated time, I would be hard pressed to come up with a date. The more requirements implemented, in the long run, may do more to discourage editors and make the wait time much longer.

I know this is not the type of comment/answer you were looking for, I just hope it helps you understand that while it is a great idea, in practice it would not be very feasible (IMHO).

Jeff
 

kujanomiko

About a week ago, I happened to be editing a high spam cat. There were over 500 submissions if I was correct, and I whittled it away in about an hour to about half of that (don't remember exact numbers). And that was just skimming the surface for the obvious spam and duplicate submissions, not investigating the deeper, perhaps more-to-the-story ones...
 