archive-crawler
Messages

  Messages Help
Advanced
Messages 34 - 63 of 6147   Oldest  |  < Older  |  Newer >  |  Newest
Messages: Simplify | Expand   (Group by Topic) Author Sort by Date ^
34
Gordon, Attached are - A proposal for the JAnt based nightly builds and JUnit based unit tests. Please review it. - Schedule for the work items with...
G.B.Reddy
gbreddysoft
Apr 2, 2003
8:13 pm
35
I think it would be most useful for you to do the SEDA "overhead determination" work ASAP, ie, reschedule it for now. Raymie...
Raymie Stata
rstata
Apr 3, 2003
8:58 am
36
On a recent test crawl, we stepped on an interesting, unintentional crawler trap related to soft 404s, relative URLs, and the implicit typing of certain...
Gordon Mohr
gojomo
Apr 4, 2003
12:52 am
37
... I think a number of recent commercial distros have backported the O(1) scheduler themselves: Red Hat 8 and Suse 8.1, at least, maybe recent Mandrakes. -...
Gordon Mohr
gojomo
Apr 4, 2003
2:05 am
38
Gordon, I am presently working on the SEDA socket overhead determination work. I will get the results in the next two days. Some initial insights into it are -...
G.B.Reddy
gbreddysoft
Apr 8, 2003
6:06 pm
39
Thanks for the update! The OceanStore libhttp package is known to be very rough; really just a placeholder or starting point for what we'd need. (Note that...
Gordon Mohr
gojomo
Apr 8, 2003
6:47 pm
40
Gordon, As you say, the aSocketInputStream is needed only if the client stage is multithreaded. But the oceanstore "HttpClient" stage is internally using it to...
G.B.Reddy
gbreddysoft
Apr 10, 2003
7:18 pm
41
Just to capture the idea I mentioned yesterday in the archives: A potential way to extract Javascript-synthesized URIs from web pages without integrating a...
Gordon Mohr
gojomo
Apr 11, 2003
1:48 am
42
Gordon, Please find attached the performance report on the SEDA aSocket NIO layer. Look for the last section in the document "SEDA NIO Socket Framework" and...
G.B.Reddy
gbreddysoft
Apr 11, 2003
10:19 pm
43
Thanks for the updated analysis! However, I am concerned that the results may be more a result of the test design or the specific HTTP implementation we're...
Gordon Mohr
gojomo
Apr 14, 2003
7:04 pm
44
... Tuesday isn't good for me; how about Wednesday 8:30p PT (3:30UTC) instead? ... This looks good; I've been getting the notifications. ... I just wanted to...
Gordon Mohr
gojomo
Apr 14, 2003
7:14 pm
45
Gordon, We can have our call as suggested on Wednesday 8:30p PT (3:30UTC). The test sources used are checked into CVS. Look for the packages -...
G.B.Reddy
gbreddysoft
Apr 15, 2003
5:52 pm
46
I've moved some things out of the Anecdote CVS module, as that was never intended to be the all-inclusive home of our work. The socket tests have moved to a...
Gordon Mohr
gojomo
Apr 16, 2003
7:26 am
47
Gordon, The various tests and the results that we got today are as follows. ( In the below lines, Java downloader means the HTTP downloader which we used...
G.B.Reddy
gbreddysoft
Apr 16, 2003
4:35 pm
48
Gordon, I had a look into the SEDA code to understand the synchronization issues which we agreed on the day of discussion could be the reason behind low...
G.B.Reddy
gbreddysoft
Apr 18, 2003
5:05 pm
49
Gordon, The updated performance doc is attached. Please review the "test results section" and the "other misc results section" in the SEDA NIO pages. The JDK...
G.B.Reddy
gbreddysoft
Apr 21, 2003
5:43 pm
50
Gordon, The attached package contains the necessary sources and scripts to run the ( SEDA and non-SEDA ) downloaders. A readme is also present in it. The seda...
G.B.Reddy
gbreddysoft
Apr 23, 2003
7:50 am
51
Thanks! Some thoughts: I'd like to approach this part of the system -- buffers/streams for multi-Kb entities across one processing cycle -- at three separate ...
Gordon Mohr
gojomo
Apr 23, 2003
9:33 pm
52
Raymie, On the first day we discussed the memory pool manager, we decided that the 8MB big chunk of memory will be broken into pieces of 4K each. And the...
G.B.Reddy
gbreddysoft
Apr 24, 2003
5:11 pm
53
Gordon, The MemPoolManager updates are checked into the CVS in the ArchiveOpenCrawler module ( in the same org.archive.crawler.io package ). Some of the...
G.B.Reddy
gbreddysoft
Apr 29, 2003
11:25 pm
54
Gordon, I am presently working on doing buffered i/o over RandomAccessFile on the spilled files. On some of the other issues listed below, please send in your...
G.B.Reddy
gbreddysoft
May 5, 2003
2:19 pm
55
Sorry for not getting back to you sooner while I was travelling. Re: VirtualBuffers I think that initially, it is OK to assume that the virtualbuffers are only...
Gordon Mohr
gojomo
May 12, 2003
6:06 pm
56
Sorry for not getting back to you sooner while I was travelling. Re: VirtualBuffers I think that initially, it is OK to assume that the virtualbuffers are only...
Gordon Mohr
gojomo
May 12, 2003
6:06 pm
57
Sorry for not getting back to you sooner while I was travelling. Re: VirtualBuffers I think that initially, it is OK to assume that the virtualbuffers are only...
Gordon Mohr
gojomo
May 12, 2003
6:07 pm
58
Sorry for not getting back to you sooner while I was travelling. Re: VirtualBuffers I think that initially, it is OK to assume that the virtualbuffers are only...
Gordon Mohr
gojomo
May 12, 2003
6:25 pm
59
Gordon, Thanks for the clarifications. I will work on it to get it done. We shall have the weekly conf call tomorrow at 8:30pm PST. The updated project...
G.B.Reddy
gbreddysoft
May 13, 2003
3:15 pm
60
At our Friday April 25th meeting at the Archive, we decided that in the interest of having a demoable and focused-usable crawler as soon as possible, we would...
Gordon Mohr
gojomo
May 14, 2003
1:15 am
61
... Reviewing this document ("CVSInstructions.txt"), I don't fully agree with putting everything in a single CVS module. In particular, I still want to use a...
Gordon Mohr
gojomo
May 14, 2003
1:15 am
62
Raymie pointed out an interesting possibility in design comments a while back: that DNS lookups that occur during the crawl could be handled as just another...
Gordon Mohr
gojomo
May 14, 2003
1:35 am
63
Gordon and Raymie, Attached Synch.zip contains the following changes on the Sync model. -- A new SampleLinkExtractor.java added which does some preliminary...
G.B.Reddy
gbreddysoft
May 15, 2003
2:49 am