The Internet - drowning in its own waste?
#1
Hi,

Yesterday I was looking for the lyrics of a song that I only half remembered. Used Google and eventually found it. But in the process, I found many sites that had nothing to do with what I was looking for. And using the find function of the browser, I often could not find even one (non-trivial) word of my search string on the page I'd clicked to.

Today I felt like browsing on one of my favorite subjects, arms and armor. I have a folder of favorites/bookmarks on that topic. Of the twenty-two items in that folder, five are dead links and four have moved since I linked them. I have hundreds of pointers stored in 28 main folders, many with subfolders. I dread to think how many of those are now dead.

Other than by outright lying, I can think of three good ways to keep a person ignorant: tell them nothing, tell them too much to assimilate, tell them the truth and then change it. Seems that the Internet (and not just the web; consider the more than thirty-seven thousand newsgroups, with bunches added (and abandoned) weekly) is doing a pretty good job on two of the three ways. Not to mention that a lot of the content *is* outright lies.

So, is the net still the "Information Superhighway" that will make us all informed beings, or is it an accident saturated, hopelessly gridlocked, tangle of paths leading to nowhere of interest? I suspect that, without some sort of control, Sturgeon's law (squared) will apply. But with most forms of control, it'll become just another spin factory. The original solution, which happened by happenstance, of only letting people of a certain "class" (often CompSci 101 :) ) onto the net worked pretty well. But there is no way to go back to the days before the September That Never Ended.

So, what do you think. Will technology control the Frankenstein's monster that technology spawned? Or will the Net join broadcast TV as a form without function, a signaless source of noise?

Me? Well, I wouldn't be me if I didn't think it was going to hell on the express. :) The advantage of being a pessimist: I've never been disappointed, often vindicated, and sometimes pleasantly surprised.

--Pete

How big was the aquarium in Noah's ark?

Reply
#2
So, is the net still the "Information Superhighway" that will make us all informed beings, or is it an accident saturated, hopelessly gridlocked, tangle of paths leading to nowhere of interest?

It's somewhere in between. At least in our corner of the world, people's interest level is often the limiting factor in their pursuit of knowledge, rather than access or learning capacity. I think that was true before the internet days, so the internet could only be evolutionary rather than revolutionary in terms of building an educated populace. The libraries have been there, and really are not too inconvenient, but only those who have enough interest in the specific information are going to use the resources available. I suspect the same thing could be said in most countries where internet access is now widely available.

The internet leads to many places of interest, but one man's interest is another man's tripe. The key, then, is to refine our methods for cutting through all the tripe that someone else apparently thought was worth putting on a {website, email, listserv, etc}. Our searching capability needs to keep up with the pace at which the information swarm grows. The system is still working at this point, but it could certainly be better. The Google-style search engine was built for a much smaller information system than the one it is now asked to navigate, but I'm confident that it will be replaced in time with something more up to the task. Of course, that is only one way out of many to find what you are looking for. This site started with a set of useful, related links, and I notice that many people use the posters of this forum as a portal to find useful sites for just about anything.
Reply
#3
Quote:So, is the net still the "Information Superhighway" that will make us all informed beings, or is it an accident saturated, hopelessly gridlocked, tangle of paths leading to nowhere of interest?

Who said that the intarweb is an information superhighway that will make us intelligent? I never heard that... Although it is full of information, I don't think it makes us particularly intelligent :huh: but whatever. Whoever you heard that 'information superhighway' from, they were right; tar intarweb is full of information, but not necessarily truthful information.
Reply
#4
Generally, systems lose efficiency when the components realize how to "achieve" status/power by manipulating the system rather than changing themselves substantially.

It's a common problem in my experience. Radically different concepts in user perspective do seem to balance this out, but by necessity such responses tend to happen as dramatic shifts.

You can see this in evolution, advertising, even building controls.


The issue of dead links, though, is just to be expected in dynamic systems.
Reply
#5
I started to write a post talking about big business and advertising and interactivity. I struggled with it because I kept coming to the same point you were drawing. It's going down and the hand is waving. The net offers the opportunity for people with a passion to express it freely. The Lurker Lounge is an excellent example of that. Just as with TV, most of what is out there is designed to draw in advertising dollars. Although there are good sites that do rely on advertising, those dollars do tend to bend most sites to their will. Witness how many searches take you to other "search engines" or sites full of banner ads where the only place you find your search term is in the keywords designed to draw you in. All this dreck is the water the net surfers are drowning in. Finding the sites that do justice to their passion means swimming up through all the rest. Most people expect things to just work, like the toaster sitting on their counter, so they believe that what they can reveal through a simple search is the be-all and end-all on their subject. They can't or won't take the time and effort to even understand the computers they search on, so why would they put such effort into using something they consider part of that computer? Computers 101 should be required of every person purchasing a computer, followed by Internet 101 for anyone signing up with an ISP, maybe even Internet 102 for those choosing AOL or MSN. :P
Lochnar[ITB]
Freshman Diablo

"I reject your reality and substitute my own."
"You don't know how strong you can be until strong is the only option."
"Think deeply, speak gently, love much, laugh loudly, give freely, be kind."
"Talk, Laugh, Love."
Reply
#6
unrealshadow13,Jul 23 2004, 04:09 PM Wrote:Who said that the intarweb is an information superhighway that will make us intelligent? I never heard that... Although it is full of information, I don't think it makes us particularly intelligent :huh: but whatever. Whoever you heard that 'information superhighway' from,  they were right; tar intarweb is full of information, but not necessarily truthful information.
He never stated that it would make anyone intelligent, just informed beings. The Internet has been called the Information Superhighway for quite some time, though, long enough for it to be a commonly accepted term.
Reply
#7
Pete,Jul 23 2004, 01:13 PM Wrote:Hi,

Yesterday I was looking for the lyrics of a song that I only half remembered. 

Pete what is the Name of the Song ?

Today I felt like browsing on one of my favorite subjects, arms and armor.  I have a folder of favorites/bookmarks on that topic. 

Ditto, I have about 1000 URLs & I find many dead...it's easier to Google than search thru the folders.

Not to mention that a lot of the content *is* outright lies.

Yes, just like in the Real World...deception at its worst

So, is the net still the "Information Superhighway" that will make us all informed beings, or is it an accident saturated, hopelessly gridlocked, tangle of paths leading to nowhere of interest? 

All the above, Except "nowhere of interest" I Still find the WWW interesting, I can only Imagine what it will be like in another 10-20 years

So, what do you think.  Will technology control the Frankenstein's monster that technology spawned?  Or will the Net join broadcast TV as a form without function, a signaless source of noise?

My answer Below  :wub: 
Hi

I'm 62 years young, I remember sitting at a Radio for Entertainment. Imagine what life was before the Radio..."Ouch"...now you have PAY Radio. :o

I remember the Birth of TV & the Birth of PAY TV

YEP you see where I'm going with this, although we are already there..."PAY Internet"...Pay & Play Games.

The question is, will the content be Better when we PAY...not, IF we will Pay ? :blink:
________________
Have a Great Quest,
Jim...aka King Jim

He can do more for Others, Who has done most with Himself.
Reply
#8
King Jim,Jul 23 2004, 06:42 PM Wrote:The question is, will the content be Better when we PAY...not, IF we will Pay ?  :blink:
If Internet follows the Pay TV vs Free TV or the Pay Radio vs Free Radio model, then yes, the content will be somewhat better.

Not good enough, just better. While Pay Radio is still not as good content-wise as one's own music collection, it is better than the free radio stations that all play the same garbage.
Same for Television. The content on premium channels is not as good as if one just stuck to DVDs (no commercials, only buy what you actually like), but it is significantly better than over-the-air network programming.

However, if the Internet follows the model of pay to play games versus the standard games, then we're all in trouble ;) .
Reply
#9
Information, or rather misinformation, is a huge problem. It's alarming just how wrong most things are. Whether it's the internet, textbooks, or the news on the TV, things are just plain ol' jumbled up. Before the internet, TV was hands down the worst. Now that the internet abounds, we are able to bask in the splendorous glow of the king of bad information.

Much like the question of whether or not human beings are inherently good or evil, the question of whether technology is ultimately good or ultimately bad for humans is a tough one to answer. I think a large problem is that most people take technology as an inherently good thing, without ever really questioning it. Is the good of hospitals worth the bad of missiles? Is the good of artificially flavored candy worth the bad of cancer?

To steer this behemoth of an open-ended question back on course, I believe the internet will continue to be a wasteland of misinformation. Going to hell on the express? How many more Battle.net forums do you need to witness before you realize that it's already there?

There are always shiny objects of good and truth in the rubble, and I suppose that's why we continue to use the internet: for the possibility of it bringing good. And it's there if you look hard enough, but sometimes it's harder than fighting the Diablo Clone with a naked barbarian.

-Munk
Reply
#10
It really becomes a question of what you expect information to look like.

Here, first off, how many people do you think you've educated Pete? Online, or offline? One way or another, interaction is one of the primary educating factors in a person's life, and through the internet, you're able to interact with many more people than you would otherwise. Though it's not a hard rule to go by, intelligent people are the ones who are usually learnt from. Ignorance spreads, of course. But do you know more by the interactions you've had on the internet? Probably yes. And a lot of it, through the filter of common sense and checking other sources, is probably pretty close to true.

Secondly, the primary crux of the Internet and the Information Superhighway is the fact that it's run by people. People are a lot of things, but the real defining point is that they're all different. Some websites are well organised and educating, some are crass, commercial outlets. Some are bizarre, poorly designed, blobs of information presented in the same way you'd eat if the buffet restaurant exploded. The problem here is that the Internet looks exactly like it would if every person out there in every city built their own business from the ground up, their own roads, gave their own directions and presented their own ideas. Imagine a city where every 15 year old built his own house, showed you pictures of his life and presented you with ideas about all the bizarre thoughts in his head.

Those two factors go together to form the internet: information through interaction, and infinite ability to express and create interaction. With that in mind, I'm not really all too surprised that the internet is being pushed and changed by the rapidly expanding user base into something that reflects them. Of course, that doesn't mean it's a hellish wasteland. It's like everything else within humanity, though.

I'd still say I've learnt a great deal from the internet. I'm well suited to using it, mind you, but regardless I think I get just as much out of it as I did almost ten years ago. It's changing into more of a human place, but that was always going to be the end result.


On a side note, I like to think of myself as a realist.
My other mount is a Spiderdrake
Reply
#11
Hi,

blobs of information presented in the same way you'd eat if the buffet restaurant exploded

I like that image. I like it very much. Kinda like gourmet garbage. Thanks :)

--Pete

Reply
#12
Hi,

Pete what is the Name of the Song ?

Well, the song turned out to be Men of Harlech and I found a number of versions, two of the best being here and here. But neither of those two is exactly the version I was looking for. Or at least, I don't think so. The version I was thinking of was the one sung by the Welshmen in Zulu.

Ditto, I have about 1000 URLs & I find many dead...it's easier to Google than search thru the folders.

Yeah, if you could ever get back to the info you found the first time. But since most of the good sites were found by following a link from a link from a Googled site, and half the time it's not even obvious which link got you there during the same session, if a link is dead, that info is lost.

--Pete

Reply
#13
Like people have said, better search engines are very important for making the internet really work, because right now it's hit or miss. Somehow a TV guide equivalent is needed. The main difference between the internet and, say, books at home is that the internet might give me crap, but sometimes I hit paydirt I wouldn't have found any other way. I only found the Lurker Lounge from searching for Diablo 2 strategies. Before here I had never looked at forums, plus the "talking" has got me a lot of interesting information. If I hadn't been looking just to look, I wouldn't have found all that. The main point is that all this media stuff is mostly useful for finding new things. They might be misses, but the hits are usually worth it.
I may be dead, but I'm not old (source: see lavcat)

The gloves come off, I'm playing hardball. It's fourth and 15 and you're looking at a full-court press. (Frank Drebin in The Naked Gun)

Some people in forums do the next best thing to listening to themselves talk, writing and reading what they write (source, my brother)
Reply
#14
Another factor rarely mentioned is the way the search engines have evolved, and are evolving, the amount of control the user has over the search. Few people, for example, know what a difference adding such things as quotation marks "", brackets (), and other such little characters can make. Don't believe me? Google the following searches:
1) "frisbee muffin"
2) (frisbee muffin)
3) frisbee muffin

Very contrived example, obviously, but the difference is huge. Look here: www.google.com/help/refinesearch.html for details on how Google can make your search much more successful.
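To make the quoted-versus-unquoted difference concrete, here is a rough Python sketch. It is a toy and not how Google actually works: the three pages and both function names are made up for illustration, and a real engine consults an index rather than scanning raw text.

```python
# Toy illustration only: hypothetical pages, not real search results.
PAGES = {
    "page_a": "the frisbee muffin recipe nobody asked for",
    "page_b": "throw a frisbee then eat a muffin",
    "page_c": "muffin tins for sale",
}

def match_all_words(query, pages):
    """Unquoted query: every word must appear somewhere on the page."""
    words = query.lower().split()
    return sorted(
        name for name, text in pages.items()
        if all(w in text.lower().split() for w in words)
    )

def match_phrase(query, pages):
    """Quoted query: the words must appear next to each other, in order."""
    words = query.lower().split()
    n = len(words)
    hits = []
    for name, text in pages.items():
        tokens = text.lower().split()
        # Slide a window of n tokens across the page and compare.
        if any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1)):
            hits.append(name)
    return sorted(hits)

print(match_all_words("frisbee muffin", PAGES))  # pages containing both words
print(match_phrase("frisbee muffin", PAGES))     # pages containing the exact phrase
```

In this toy data the unquoted search matches two pages while the quoted one matches only the page where the two words sit side by side, which is the narrowing effect described above.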

I guess the way I see it, the internet is similar to most things in life -- real value buried in plenty of garbage. You either avoid both the good and the bad, slog through randomly hoping to get lucky, or do your best to improve the odds.

The other thing I'd like to mention is websites, well, such as this one. Hitting a five year anniversary proves the lounge is here to stay, and yet the information available here changes almost every second. For fixed numbers and stats and information, I try to keep the most stable, "official" sites I can bookmarked. For more variable information, and in particular for "newer" info, I try to keep a few forum type websites within easy access. If I can't google it or figure it out on my own, chances are someone online can point me in the right direction.

gekko
"Life is sacred and you are not its steward. You have stewardship over it but you don't own it. You're making a choice to go through this, it's not just happening to you. You're inviting it, and in some ways delighting in it. It's not accidental or coincidental. You're choosing it. You have to realize you've made choices."
-Michael Ventura, "Letters@3AM"
Reply
#15
Ok this is a weee bit off topic, but I've been wondering for quite some time, and we're talking about the workings of search engines...

I'm pretty sure that search engines have a huge database of websites. The search itself isn't conducted over the entire internet, but only through the catalogued (sic?) websites in storage. If this is true, then how do websites get into the database in the first place? If this whole database thing isn't the way it works, and the search engine is actually scanning the entire intarweb, how does it work? Does it jump from IP to IP? Wha... :huh:
Reply
#16
Hi,

first, please correct me if I'm wrong, but brackets in search queries don't do anything in Google.

Quote: Another factor rarely mentioned is the way the search engines have evolved, and are evolving, the amount of control the user has over the search.

I wish I could agree, but I'm afraid just the opposite is the case. I don't know how long you have known about and used Google, but I remember the time Google was released to the public, along with several papers describing the internal architecture of the whole system and the PageRank algorithm, which is the main reason for Google being so popular (besides being fast and not diluting the results with paid-for advertisement) because it did a great job of sorting the results. Anyway, back then Google (like most other search engines) allowed AND, OR, +, -, () and wildcards in search queries, along with the options to find words in URLs etc. as they do now. The users were able to specify exactly what to search for; they had complete control and a powerful query language. It was paradise.

Then a time came when search engine companies found out that more and more "normal", aka non-geek, people had started to use the net, that the majority of these people were confused by all the options and did not know what to do with all the operators, let alone how wildcards work, and that these people were actually scared away from search engine sites which offered these. So search engines like AltaVista started to hide these options behind small "advanced search" links, and stopped listing all query options, in order to attract more people - but at least they still offered these now-hidden options.

Then, one day, I was suddenly thrown out of paradise: Google stopped supporting wildcards, OR, and structuring queries with brackets (OR found its way back in eventually). Instead, they introduced stemming and synonyms as replacements for the stupid. Argh! While I see the usefulness of stemming, why on earth have they cut the powerful tools I used every day, especially wildcards?!? I guess they did it to increase server-side performance, but Google lost a lot of appeal for me that day. I sent several mails to them asking for a reason and complaining, but never got an answer back.

So while you're right that most people don't know how to refine their search in Google, there are still people out there who know how they could refine their search even more if Google still allowed it, and the amount of control the user has over the search in Google has decreased actually. There are still a lot of search engines out there allowing the full set of query options, yes, but compared to Google's (now surely enhanced) PageRank algorithm, they lack in the way results are sorted for relevancy, so they are no real competition. If anybody knows about a search engine using a similar algorithm to sort results which also offers the full set of query options, I'd be a very happy man!

-Kylearan
There are two kinds of fools. One says, "This is old, and therefore good." And one says, "This is new, and therefore better." - John Brunner, The Shockwave Rider
Reply
#17
Quote:I'm pretty sure that search engines have a huge database of websites. The search itself isn't conducted over the entire internet, but only through the catalogued (sic?) websites in storage. If this is true, then how do websites get into the database in the first place? If this whole database thing isn't the way it works, and the search engine is actually scanning the entire intarweb, how does it work? Does it jump from IP to IP? Wha... 

Yes, the search engine scans (more or less) the entire web - in Google's case. I don't know all the exact details, but it includes following links on scanned webpages to scan more webpages, for example.
There are other search engines that are catalogue-based, though, i.e. edited by humans. Google used to be both, and I think they still are. They have both a human-edited catalogue and a bot-created database.

On the original topic: I am a lot more optimistic than the other posters in this thread seem to be. For me, the Internet is still very very useful, both as an access to information and as a medium for social interaction. And I can't see that changing anytime soon. In fact, I think the usefulness is increasing, for example with projects like Wikipedia.
I don't share the impression that the garbage is growing so much faster than the useful information. If it is, then it is - at least for me - still easy enough to sort out all the garbage.
If the garbage grows by 100% and the quality information grows by 10% and you can easily ignore all the garbage, then in the end, the overall quality has improved by 10% ;)
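A quick back-of-the-envelope check of that in Python, with completely made-up starting amounts (100 units of quality information, 1000 units of garbage) and a hypothetical helper called grow. It shows both sides: the amount of good stuff rises by 10%, while its share of the total drops, so the claim only holds as long as ignoring the garbage really stays easy.

```python
def grow(useful, garbage, useful_growth=0.10, garbage_growth=1.00):
    """Apply the growth rates from the post: +10% useful, +100% garbage."""
    return useful * (1 + useful_growth), garbage * (1 + garbage_growth)

# Made-up starting amounts, purely for illustration.
useful, garbage = 100.0, 1000.0
new_useful, new_garbage = grow(useful, garbage)

share_before = useful / (useful + garbage)
share_after = new_useful / (new_useful + new_garbage)

print(f"useful information: {useful:.0f} -> {new_useful:.0f}")
print(f"share of the total: {share_before:.1%} -> {share_after:.1%}")
```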
I think that the really dangerous threats to the net come from other developments: Over-commercialization and government control.
Reply
#18
Yes, it is a cesspool and it's getting worse. That doesn't mean it can't get better. I am pessimistic that it will as well.

Some ideas for correction;

1] Enforceable rules and a worldwide network of Internet enforcement agencies. A new packet protocol that requires encryption, and tightly embeds the packet sender's credentials. While I'm an advocate of anonymous speech, I'm a bigger proponent of free speech. Free speech adds the responsibility of the content to the speaker. What the Internet needs now is accountability of content back to the originator. I want to be able to track spammers and virus authors directly, and in order to do that I need to give up being anonymous.

2] We need a clear physical divide between information and trash. If it is done by IP address, then I would suggest that as a first step the InterNIC adopt a 5th octet IP address standard with the current Internet being zero. Then, I would start to assign the fifth octet by information provider category. Or, it could be done by domain name, but in either case someone needs to enforce that there is a difference between the information offerings of a library, or Wikipedia, and a porn site. If we can constrain all the adult content to a particular physical address, government information to another, commerce organized by industry to others, university research to others, etc., then at least each community can be policed using its own community standards or watchdogs.

3] Another obstacle to the Internet being an "Information Superhighway" is copyright. Because there is little control or accountability, anyone who publishes anything on the web in effect offers their work to the public domain. The fan fiction or unique photograph you publish on your web site today may be in print and earning someone fat royalties in Thailand tomorrow. Without worldwide agreements on copyright protection, there will be less of the quality content by professionals and more of the musings of amateurs. So, this goes back to point 1]. With worldwide rules and enforcement, and accountability on individuals who violate the rules, content providers may be willing to publish more on the Internet.

4] Better portals. What you experienced in your dead links is a malaise of the Internet. It is partly a symptom of a down economy where web site owners or ISPs are unable to continue due to financial problems. Google, Yahoo, etc. need to shed their dead links. There has always been a certain degree of transience of content due to some people publishing vast and useful stuff on their university or personal accounts. It would be nice if the portals offered a forwarding service like the NCOA. Then there is the burden of all the pages that are just outdated. Another portal problem is that it has become a standard practice for content providers to store information in databases that are inaccessible to web crawlers looking for linkable content.

"Will technology control the Frankenstein's monster that technology spawned? Or will the Net join broadcast TV as a form without function, a signaless source of noise?"

I'm an advocate for it becoming what it was intended to be, but pessimistic. If I had $100 on the line I'd have to say "broadcast TV as a form without function -- and lacking the FCC".

:) Good topic.
"There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy." - Hamlet (1.5.167-8), Hamlet to Horatio.

Reply
#19
Do they have a help button somewhere that is just hard to see? That would be really useful.

Something like a short section talking about how all the different options work would be really helpful in this case, since a lot of people like me probably don't know you can try stuff like parentheses and quotation marks to change searches. Just sticking the information there would make it a lot easier to use.
Reply
#20
Hi,

first, you have to distinguish between web catalogues (human-generated directories of web sites, structured by content) and "real" search engines that try to index the whole public internet (that means, all sites reachable through web links - sites which can only be seen if you know the exact URL, but no other public sites link to them, won't appear in the search results unless the owner of that site has manually registered the site with the search engine). Google is one of the IIRC three "real" search engines left in the market.

Here's a quick, simplified overview of Google's architecture:

[Image: google_arch.gif]

Basically, Google is split into three parts. First, there's the part responsible for fetching and indexing the web sites. Web crawlers (sometimes also called harvesters, or spiders) are fed with a list of URLs to visit, for example if you register your homepage with Google. A crawler downloads the page and stores it in the repository. It also examines each meaningful word of the site, ignoring stop words like "the", "a", "and" etc., and if it encounters a new word, stores it in a lexicon. Additionally, for each word in the page, a link from that word in the lexicon to the web site is added to the hit list, together with additional information about the word in that context, for example how often the word is used in the page, whether it's used inside a headline, inside a link etc.
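To make the lexicon and hit list idea more tangible, here is a tiny Python sketch of the indexing step just described. It is only a toy under my own assumptions, not Google's actual data structures: the function name index_page, the example URLs, and the choice to record nothing but a per-page word count (no headline or link information) are all made up for illustration.

```python
from collections import defaultdict

# The three stop words named in the post; a real list would be much longer.
STOP_WORDS = {"the", "a", "and"}

def index_page(url, text, lexicon, hit_lists):
    """Add one downloaded page to the toy index.

    lexicon:   set of every meaningful word seen so far
    hit_lists: word -> {url: number of occurrences on that page}
    """
    for raw in text.lower().split():
        word = raw.strip(".,!?\"'()")   # crude punctuation stripping
        if not word or word in STOP_WORDS:
            continue
        lexicon.add(word)               # new words enter the lexicon
        hit_lists[word][url] = hit_lists[word].get(url, 0) + 1

lexicon = set()
hit_lists = defaultdict(dict)

# Hypothetical pages standing in for what a crawler downloaded.
index_page("http://example.com/a", "The sword and the shield.", lexicon, hit_lists)
index_page("http://example.com/b", "A sword, a sword!", lexicon, hit_lists)

print(sorted(lexicon))
print(dict(hit_lists["sword"]))
```

Answering a query for "sword" is then just a lookup in hit_lists instead of a scan over every stored page, which is why the search never has to touch the live web.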

Then, after indexing the web page, it takes all URLs (links) found in that page, and adds them to the list of pages to examine next. Different strategies exist for ordering that list, for example breadth-first (indexing *every* page linked to from the current page before indexing any of the pages those pages link to, and so on), depth-first (indexing *all* pages reachable through the first link on the page, then *all* pages reachable through the second link, and so on), and more-or-less intelligent mixtures of both, all with their own set of advantages/disadvantages. These crawlers need a lot of bandwidth, should be located on different internet backbones to speed up indexing, and should start with URLs from different parts of the web.
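The two orderings can be shown with a small Python sketch. Everything in it is hypothetical: the link graph LINKS stands in for links found on real pages, and crawl_order is a name invented for this example. A real crawler would fetch pages over the network and deal with politeness, failures, and scale, none of which appears here.

```python
from collections import deque

# Hypothetical link graph: page -> links found on it, in page order.
LINKS = {
    "start": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}

def crawl_order(seed, links, breadth_first=True):
    """Return the order in which pages would be visited."""
    frontier = deque([seed])   # the list of pages to examine next
    seen = {seed}              # never queue the same URL twice
    order = []
    while frontier:
        # Breadth-first takes the oldest entry, depth-first the newest.
        page = frontier.popleft() if breadth_first else frontier.pop()
        order.append(page)
        found = links.get(page, [])
        # For depth-first, push in reverse so the first link is taken first.
        for link in (found if breadth_first else reversed(found)):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

print(crawl_order("start", LINKS, breadth_first=True))
print(crawl_order("start", LINKS, breadth_first=False))
```

The only difference between the two strategies in this sketch is which end of the frontier the next page is taken from. Breadth-first finishes both pages linked from the start page before going deeper, while depth-first exhausts everything under the first link before touching the second.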

The second part of Google is the set of databases already mentioned: Lexicon, Hit Lists, and Repository. Intelligent data structures for fast queries are mandatory.

The third part is the user interface we all see when surfing to google.com; it gets search queries from the user, asks the database about them, and presents the results.


This kind of architecture has several consequences. First, not all pages on the net are indexed. I've forgotten the exact figures, but a surprisingly large amount of the net is not reachable by any link and thus not present in the index. Second, it takes a lot of time until the whole net is indexed, and only then can the crawlers re-visit sites they have already visited before. Thus, it takes time before any changes to a web site are carried over into the search engine database. Some years ago, it could take months until search engines realized a URL was dead, or that a site had changed. Later, turnaround times were reduced to weeks, and nowadays Google claims that it will only need days for an update, though I somehow doubt they manage to crawl the "whole WWW" completely in just a few days! Instead, I suspect they "cheat" by visiting major sites more often and delaying updates to sites that are only seldom searched for.


If you want to know more details, I can recommend this paper from the founders of Google, Sergey Brin and Lawrence Page (after whom the PageRank algorithm is named), about the anatomy of early Google. It was written way back when, in happier days when Google wasn't yet commercial and wasn't forced to fight search engine spammers, and thus could afford to publish its ideas in a scientific paper. :)

Hope that helps.

-Kylearan
Reply