archive-crawler
Messages

Messages 16 - 45 of 6140
16
Yes, it is a local name server. It could also be remote. -Reddy ... From: Gordon Mohr To: archive-crawler@yahoogroups.com Cc: Raymie Stata ;...
G.B.Reddy
gbreddysoft
Mar 3, 2003
1:19 pm
17
Driven by our meeting with Raymie last Thursday, and refined by further analysis, here are some notes on our design directions. = STAGED CRAWLER DESIGN NOTES =...
Gordon Mohr
gojomo
Mar 5, 2003
10:29 pm
18
Gordon and Raymie, Below are the various stages and their design with the issues involved in the DNS Resolver and HTTP Client implementation. DNS History/Cache...
G.B.Reddy
gbreddysoft
Mar 6, 2003
5:52 pm
19
Patrick Eaton forwarded me a pair of staged HTTP client implementations which are part of the OceanStore project at Berkeley, and are essentially what are also...
Gordon Mohr
gojomo
Mar 7, 2003
1:40 am
20
I've just checked into Sourceforge CVS the module 'Anecdote', a first stab at a staged crawler. Right now it just sets up dummy printing stages, grabs a list...
Gordon Mohr
gojomo
Mar 7, 2003
2:11 am
21
More insight on the DNS stages. As stated in the design earlier, "DNS Querying Stage", "DNS Response Processing Stage" and "Timeout and Retry Handling Stage"...
G.B.Reddy
gbreddysoft
Mar 7, 2003
4:31 pm
22
Gordon, Igor, Raymie present. (1) Access to work in progress: start using SourceForge CVS (Post meeting note: 2 modules now exist there: 'Anecdote', a staged...
Gordon Mohr
gojomo
Mar 7, 2003
9:44 pm
23
I added very dumb HTTP fetching to the Anecdote 'Fetching' stage via the Apache Commons HTTPClient library soon after my message yesterday. ... This spinning...
Gordon Mohr
gojomo
Mar 7, 2003
9:51 pm
24
These are good decompositions of the steps involved, and the LGPL dnsjava library looks very useful for our needs. My tendency would be to think fewer stages...
Gordon Mohr
gojomo
Mar 7, 2003
11:30 pm
25
Gordon, I am done with the asynchronous DNS code. I shall test it more tomorrow and checkin. I may start using the caching mechanism present in the dnsjava ...
G.B.Reddy
gbreddysoft
Mar 12, 2003
4:15 pm
26
Gordon, I have checked in the first version of the asynchronous DNS lookup stage (DNSLookingUp.java). I have also updated the README and the anecdote.cfg file...
G.B.Reddy
gbreddysoft
Mar 17, 2003
8:08 pm
27
I'll take a look. Don't feel obligated to go with Eclipse -- even though it is a very nice environment. Eventually we'll include versioned ant scripts with...
Gordon Mohr
gojomo
Mar 17, 2003
11:42 pm
28
Gordon, Yes, as you said dnsjava creates a new udpsocket for every message. I am planning to separate out the processing logic from the socket related code and...
G.B.Reddy
gbreddysoft
Mar 18, 2003
2:20 am
29
I'm trying out the 'libhttp' staged HTTP code we were passed by the Berkeley OceanStore project, and it requires all aspects of the outbound request to be...
Gordon Mohr
gojomo
Mar 19, 2003
7:38 pm
30
Gordon Mohr
gojomo
Mar 19, 2003
8:59 pm
31
As I understand it, the largest header Mercator will set is: GET /foo.html HTTP/1.0 User-Agent: Mercator-1.0 Host: foo.com From:...
Raymie Stata
rstata
Mar 19, 2003
9:19 pm
32
Gordon, I have proposed a detailed design for supporting asynchronous DNS lookups in the dnsjava libraries to its author (Brian Wellington). He is yet to get ...
G.B.Reddy
gbreddysoft
Mar 24, 2003
4:43 pm
33
Major additions and changes: - Moved "pumping" activity into URIChoosing stage, so it can better react to depletion of URIs to consider - Converted most...
Gordon Mohr
gojomo
Mar 25, 2003
3:00 am
34
Gordon, Attached are - A proposal for the JAnt based nightly builds and JUnit based unit tests. Please review it. - Schedule for the work items with...
G.B.Reddy
gbreddysoft
Apr 2, 2003
8:13 pm
35
I think it would be most useful for you to do the SEDA "overhead determination" work ASAP, i.e., reschedule it for now. Raymie...
Raymie Stata
rstata
Apr 3, 2003
8:58 am
36
On a recent test crawl, we stepped on an interesting, unintentional crawler trap related to soft 404s, relative URLs, and the implicit typing of certain...
Gordon Mohr
gojomo
Apr 4, 2003
12:52 am
37
... I think a number of recent commercial distros have backported the O(1) scheduler themselves: Red Hat 8 and Suse 8.1, at least, maybe recent Mandrakes. -...
Gordon Mohr
gojomo
Apr 4, 2003
2:05 am
38
Gordon, I am presently working on the SEDA socket overhead determination work. I will get the results in the next two days. Some initial insights into it are -...
G.B.Reddy
gbreddysoft
Apr 8, 2003
6:06 pm
39
Thanks for the update! The OceanStore libhttp package is known to be very rough; really just a placeholder or starting point for what we'd need. (Note that...
Gordon Mohr
gojomo
Apr 8, 2003
6:47 pm
40
Gordon, As you say, the aSocketInputStream is needed only if the client stage is multithreaded. But the oceanstore "HttpClient" stage is internally using it to...
G.B.Reddy
gbreddysoft
Apr 10, 2003
7:18 pm
41
Just to capture the idea I mentioned yesterday in the archives: A potential way to extract Javascript-synthesized URIs from web pages without integrating a...
Gordon Mohr
gojomo
Apr 11, 2003
1:48 am
42
Gordon, Please find attached the performance report on the SEDA aSocket NIO layer. Look for the last section in the document "SEDA NIO Socket Framework" and...
G.B.Reddy
gbreddysoft
Apr 11, 2003
10:19 pm
43
Thanks for the updated analysis! However, I am concerned that the results may be more a result of the test design or the specific HTTP implementation we're...
Gordon Mohr
gojomo
Apr 14, 2003
7:04 pm
44
... Tuesday isn't good for me; how about Wednesday 8:30p PT (3:30UTC) instead? ... This looks good; I've been getting the notifications. ... I just wanted to...
Gordon Mohr
gojomo
Apr 14, 2003
7:14 pm
45
Gordon, We can have our call as suggested on Wednesday 8:30p PT (3:30UTC). The test sources used are checked into CVS. Look for the packages -...
G.B.Reddy
gbreddysoft
Apr 15, 2003
5:52 pm