archive-crawler

Group Information

  • Members: 795
  • Category: Cyberculture
  • Founded: Dec 1, 2002
  • Language: English

Messages

Messages 34 - 63 of 8123
34 G.B.Reddy
gbreddysoft
Apr 2, 2003
8:13 pm
Gordon, Attached are - A proposal for the JAnt based nightly builds and JUnit based unit tests. Please review it. - Schedule for the work items with...
35 Raymie Stata
rstata
Apr 3, 2003
8:58 am
I think it would be most useful for you to do the SEDA "overhead determination" work ASAP, ie, reschedule it for now. Raymie...
36 Gordon Mohr
gojomo
Apr 4, 2003
12:52 am
On a recent test crawl, we stepped on an interesting, unintentional crawler trap related to soft 404s, relative URLs, and the implicit typing of certain...
37 Gordon Mohr
gojomo
Apr 4, 2003
2:05 am
... I think a number of recent commercial distros have backported the O(1) scheduler themselves: Red Hat 8 and Suse 8.1, at least, maybe recent Mandrakes. -...
38 G.B.Reddy
gbreddysoft
Apr 8, 2003
6:06 pm
Gordon, I am presently working on the SEDA socket overhead determination work. I will get the results in the next two days. Some initial insights into it are -...
39 Gordon Mohr
gojomo
Apr 8, 2003
6:47 pm
Thanks for the update! The OceanStore libhttp package is known to be very rough; really just a placeholder or starting point for what we'd need. (Note that...
40 G.B.Reddy
gbreddysoft
Apr 10, 2003
7:18 pm
Gordon, As you say, the aSocketInputStream is needed only if the client stage is multithreaded. But the OceanStore "HttpClient" stage is internally using it to...
41 Gordon Mohr
gojomo
Apr 11, 2003
1:48 am
Just to capture the idea I mentioned yesterday in the archives: A potential way to extract Javascript-synthesized URIs from web pages without integrating a...
42 G.B.Reddy
gbreddysoft
Apr 11, 2003
10:19 pm
Gordon, Please find attached the performance report on the SEDA aSocket NIO layer. Look for the last section in the document "SEDA NIO Socket Framework" and...
43 Gordon Mohr
gojomo
Apr 14, 2003
7:04 pm
Thanks for the updated analysis! However, I am concerned that the results may be more a result of the test design or the specific HTTP implementation we're...
44 Gordon Mohr
gojomo
Apr 14, 2003
7:14 pm
... Tuesday isn't good for me; how about Wednesday 8:30p PT (3:30UTC) instead? ... This looks good; I've been getting the notifications. ... I just wanted to...
45 G.B.Reddy
gbreddysoft
Apr 15, 2003
5:52 pm
Gordon, We can have our call as suggested on Wednesday 8:30p PT (3:30UTC). The test sources used are checked into CVS. Look for the packages -...
46 Gordon Mohr
gojomo
Apr 16, 2003
7:26 am
I've moved some things out of the Anecdote CVS module, as that was never intended to be the all-inclusive home of our work. The socket tests have moved to a...
47 G.B.Reddy
gbreddysoft
Apr 16, 2003
4:35 pm
Gordon, The various tests and the results that we got today are as follows. ( In the below lines, Java downloader means the HTTP downloader which we used...
48 G.B.Reddy
gbreddysoft
Apr 18, 2003
5:05 pm
Gordon, I had a look into the SEDA code to understand the synchronization issues which we agreed on the day of discussion could be the reason behind low...
49 G.B.Reddy
gbreddysoft
Apr 21, 2003
5:43 pm
Gordon, The updated performance doc is attached. Please review the "test results section" and the "other misc results section" in the SEDA NIO pages. The JDK...
50 G.B.Reddy
gbreddysoft
Apr 23, 2003
7:50 am
Gordon, The attached package contains the necessary sources and scripts to run the ( SEDA and non-SEDA ) downloaders. A readme is also present in it. The seda...
51 Gordon Mohr
gojomo
Apr 23, 2003
9:33 pm
Thanks! Some thoughts: I'd like to approach this part of the system -- buffers/streams for multi-Kb entities across one processing cycle -- at three separate ...
52 G.B.Reddy
gbreddysoft
Apr 24, 2003
5:11 pm
Raymie, On the first day we discussed the memory pool manager, we decided that the 8MB big chunk of memory will be broken into pieces of 4K each. And the...
53 G.B.Reddy
gbreddysoft
Apr 29, 2003
11:25 pm
Gordon, The MemPoolManager updates are checked into the CVS in the ArchiveOpenCrawler module ( in the same org.archive.crawler.io package ). Some of the...
54 G.B.Reddy
gbreddysoft
May 5, 2003
2:19 pm
Gordon, I am presently working on doing buffered i/o over RandomAccessFile on the spilled files. On some of the other issues listed below, please send in your...
55 Gordon Mohr
gojomo
May 12, 2003
6:06 pm
Sorry for not getting back to you sooner while I was travelling. Re: VirtualBuffers I think that initially, it is OK to assume that the virtualbuffers are only...
56 Gordon Mohr
gojomo
May 12, 2003
6:06 pm
Sorry for not getting back to you sooner while I was travelling. Re: VirtualBuffers I think that initially, it is OK to assume that the virtualbuffers are only...
57 Gordon Mohr
gojomo
May 12, 2003
6:07 pm
Sorry for not getting back to you sooner while I was travelling. Re: VirtualBuffers I think that initially, it is OK to assume that the virtualbuffers are only...
58 Gordon Mohr
gojomo
May 12, 2003
6:25 pm
Sorry for not getting back to you sooner while I was travelling. Re: VirtualBuffers I think that initially, it is OK to assume that the virtualbuffers are only...
59 G.B.Reddy
gbreddysoft
May 13, 2003
3:15 pm
Gordon, Thanks for the clarifications. I will work on it to get it done. We shall have the weekly conf call tomorrow at 8:30pm PST. The updated project...
60 Gordon Mohr
gojomo
May 14, 2003
1:15 am
At our Friday April 25th meeting at the Archive, we decided that in the interest of having a demoable and focused-usable crawler as soon as possible, we would...
61 Gordon Mohr
gojomo
May 14, 2003
1:15 am
... Reviewing this document ("CVSInstructions.txt"), I don't fully agree with putting everything in a single CVS module. In particular, I still want to use a...
62 Gordon Mohr
gojomo
May 14, 2003
1:35 am
Raymie pointed out an interesting possibility in design comments a while back: that DNS lookups that occur during the crawl could be handled as just another...
63 G.B.Reddy
gbreddysoft
May 15, 2003
2:49 am
Gordon and Raymie, Attached Synch.zip contains the following changes on the Sync model. -- A new SampleLinkExtractor.java added which does some preliminary...