Since there is a one-to-one correspondence between concepts
and their cyc:CycAnnotations_v1#label attributes,
I have modified mke to display output using the attribute
instead of the concept name. The translation between concepts
and attributes uses a new GDBM table, cyclabel.
Here is an example of the new mke input and output.
rhm@rhm8200 /home/knowledge/opencyc
$ export KEDB=./db; export VIEW=latest
$ mke
ke$ do hofind od "AynRand"@en in cyclabel done;
Mx4rvr0Q45wpEbGdrcN5Y29ycA has cyc:CycAnnotations_v1#label "AynRand"@en;
ke$ Mx4rvr0Q45wpEbGdrcN5Y29ycA has ?;
"AynRand"@en has
/ rdfs:label "Ayn Rand"@en;
/ "wikipediaArticleURL"@en "http://en.wikipedia.org/wiki/Ayn_Rand";
/ "prettyString"@en "Rand"@en;
/ cyc:CycAnnotations_v1#label "AynRand"@en;
/ "wikipediaArticleName"@en "Ayn Rand";
ke$ exit;
$
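The concept-to-label translation described above can be pictured as a pair of lookups. The Python sketch below is only an illustration: it uses in-memory dictionaries rather than a GDBM table, and the class `LabelTable` and its method names are hypothetical, not part of mke. The sample identifier and label are copied from the session above.

```python
class LabelTable:
    """Toy two-way table between concept names and their label attributes."""

    def __init__(self):
        self._label_of = {}    # concept name -> label
        self._concept_of = {}  # label -> concept name

    def add(self, concept, label):
        # The scheme relies on a one-to-one correspondence, so reject clashes.
        if self._label_of.get(concept, label) != label:
            raise ValueError(f"{concept} already has a different label")
        if self._concept_of.get(label, concept) != concept:
            raise ValueError(f"{label} already names a different concept")
        self._label_of[concept] = label
        self._concept_of[label] = concept

    def display_name(self, concept):
        # Fall back to the raw concept name when no label is recorded.
        return self._label_of.get(concept, concept)

    def find(self, label):
        # Reverse lookup (label -> concept); returns None if the label is unknown.
        return self._concept_of.get(label)


table = LabelTable()
table.add("Mx4rvr0Q45wpEbGdrcN5Y29ycA", '"AynRand"@en')
print(table.display_name("Mx4rvr0Q45wpEbGdrcN5Y29ycA"))
print(table.find('"AynRand"@en'))
```

Keeping both directions in step is what lets output show the label while input by label still resolves to the underlying concept.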
Dick McCullough
http://mkrmke.org
PM: I can find programmers after I receive a project spec, so first I need to know what needs to be done; then I can think about who can do it.
The best adviser here is Dick McCullough. He is supposed to be the leader of the programming group, although he is an American and can't formally participate.
I ask Richard to tell us something about the GESS project and what programming work it will need.
We posted some experimental results as the Global Entity Search System, which you can test here: http://semanticwww.Emetasearch. Some comments below.
> I haven't studied the implementation. What I think you are saying is that
> replacing a key allocates new disk space, rather than reusing the old space,
> and that the old space is never subsequently used for a later insert or replace.
> This is surprising, I would expect it to be on a "free list" that would get used
> to satisfy a later request.
>
> In looking at the C API, there is a C function that it can use to
> "garbage collect",
> gdbm_reorganize(), and that function is very slow but does what you need. So
> the question is what is the best and most Uniconish way to provide access to
> this function. We could, for example, make it happen automatically, or we could
> make collect(db) do the job.
>
> Clint
>
> On Sat, Aug 22, 2009 at 3:57 PM, Richard H.
> McCullough<rhmccullough@...> wrote:
>>
>> Do you think GDBM will "release" the old file space
>> if I delete key before writing new value?
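The two options raised in this exchange (delete the key before rewriting it, and explicitly reorganizing the file) can be sketched in Python, whose `dbm.gnu` handles expose a `reorganize()` method that wraps gdbm_reorganize(). This is a sketch under assumptions: `replace_value` and `demo` are hypothetical helper names, `demo` only works where Python was built with GDBM support, and nothing here settles whether delete-before-write actually changes how GDBM reuses space.

```python
def replace_value(db, key, value, compact=False):
    """Delete any old entry for key, store the new value, optionally compact.

    `db` is any mapping-like handle. With compact=True it must also offer a
    reorganize() method, as handles returned by dbm.gnu.open() do.
    """
    if key in db:
        del db[key]        # drop the old record first
    db[key] = value        # write the replacement
    if compact:
        db.reorganize()    # slow: rewrites the file to reclaim freed space


def demo(path):
    """Assumes dbm.gnu is importable on this Python build."""
    import dbm.gnu

    db = dbm.gnu.open(path, "c")  # "c": open read/write, create if missing
    try:
        replace_value(db, "concept", "old label")
        replace_value(db, "concept", "new label", compact=True)
        return db["concept"]
    finally:
        db.close()
```

Because reorganizing is slow, calling it once after a batch of replacements is likely a better fit than calling it on every write.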
This is how I created the 4 subhierarchy databases that you wanted.
# online: save hierarchy database
$ cd mke/db/usecs/entity
$ mkzip E18
# download and install in /home/ke/db
$ cd /home/ke/db
$ unzip E18.zip
# offline, create each subhierarchy
$ mke
ke$ do hodelete od BOTTOM done;
ke$ BOTTOM isa ?;
BOTTOM => BOTTOM, Reality -- (root of hierarchy);
ke$ TOP isc ?;
TOP => TOP, the World, Existence, Universe -- (top of hierarchy);
/ World => world, resource, existence, universe -- ( );
ke$ exit;
rhm@rhm8200 /home/knowledge/tap
$ date;gdbmload --genus.species <tap.gs.csv;date
Sun Aug 16 22:48:06 PDT 2009
##### main(arg[1] = --genus.species) #####
##### genus_species_main(arg[1] = --load) #####
##### load_hogenus() #####
gdbm fatal: lseek error.
Sun Aug 16 22:53:27 PDT 2009
rhm@rhm8200 /home/knowledge/tap
$ ls -l $KEDB
total 2097268
-rwxr-xr-x 1 rhm None 3072 Aug 16 22:48 hogenus.dir # 3 kB
-rwxr-xr-x 1 rhm None 2147597033 Aug 16 22:53 hogenus.pag # 2 GB output
As expected, processing CSV is faster than subject-verb-object.
It took only 5:21 (min:sec) to reach the 2 GB crash.
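The figures in this report can be checked with a little arithmetic. The Python sketch below takes the two timestamps and the hogenus.pag size from the listing above and compares the size with the largest value a signed 32-bit file offset can hold. That a 32-bit offset limit caused the lseek error is an assumption here, not something the log confirms.

```python
from datetime import datetime

# Timestamps printed by `date` before and after the gdbmload run.
start = datetime(2009, 8, 16, 22, 48, 6)
end = datetime(2009, 8, 16, 22, 53, 27)
elapsed = (end - start).total_seconds()

pag_size = 2147597033          # bytes, from the ls -l listing
signed_32bit_max = 2**31 - 1   # largest offset a signed 32-bit value can hold

print(f"elapsed: {elapsed:.0f} s")
print(f"bytes past 2**31 - 1: {pag_size - signed_32bit_max}")
print(f"average write rate: {pag_size / elapsed / 1e6:.2f} MB/s")
```

The file size sits just beyond the 2**31 - 1 boundary, which is consistent with (but does not prove) a 32-bit offset limit in the build that was used.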
You'll notice that I didn't load the structure.ho hierarchy.
Instance categories are:
rhm@rhm8200 /home/knowledge/tap
$ mke
ke$ Resource isc ?;
Resource ;
ke$ do hotop done;
0 genera for BusinessMagazine;
0 genera for SnowsportMagazine;
0 genera for WesternRegionalMagazine;
0 genera for GameMagazine;
0 genera for "CMUSCS_ResearchArea";
0 genera for PhotographyMagazine;
0 genera for HuntingFishingMagazine;
0 genera for Country;
0 genera for HistoryMagazine;
0 genera for Perennial;
0 genera for AfricanAmericanMagazine;
0 genera for ConsumerElectronicsCorporation;
0 genera for CMUPerson;
0 genera for CollectiblesMagazine;
0 genera for WomensFashionMagazine;
0 genera for Author;
0 genera for ChouChouBabyDoll;
0 genera for UnbelievablySoftBabyDoll;
0 genera for ComputerScientist;
0 genera for Musician;
0 genera for BoatingMagazine;
0 genera for Movie;
0 genera for Cartoon;
0 genera for Book;
0 genera for MusicSceneMagazine;
0 genera for SpiritualityMagazine;
0 genera for HorseRidingMagazine;
0 genera for Chemist;
0 genera for SportMagazine;
0 genera for Territory;
0 genera for SoftwareCompany;
0 genera for ComputerEquipmentManufacturer;
0 genera for BrainBoardGame;
0 genera for MovieMagazine;
0 genera for ElectronicGamingMagazine;
0 genera for MusicalInstrumentMagazine;
0 genera for GolfMagazine;
0 genera for Athlete;
0 genera for RoseBush;
0 genera for City;
0 genera for WomensMagazine;
0 genera for HomeGardenMagazine;
0 genera for CMU_RAD;
0 genera for Architect;
0 genera for Composer;
0 genera for Annual;
0 genera for PlayingCardGame;
0 genera for SocioReligiousEvent;
0 genera for AutomotiveMagazine;
0 genera for UnitedStatesCity;
0 genera for MysteryBoardGame;
0 genera for AudioEquipmentMagazine;
0 genera for FishingMagazine;
0 genera for PersonalComputerGame;
0 genera for ChildrensMagazine;
0 genera for DesignDecoratingMagazine;
0 genera for SpecialtyFoodMagazine;
0 genera for CMUGraduateStudent;
0 genera for MotorcycleMagazine;
0 genera for Actor;
0 genera for Person;
0 genera for NewsMagazine;
0 genera for NatureMagazine;
0 genera for "CMUFace";
0 genera for ComicStrip;
0 genera for CommunicationMedium;
0 genera for CMUFaculty;
0 genera for SurfingMagazine;
0 genera for Evergreen_Shrub;
0 genera for MovieDirector;
0 genera for Gender;
0 genera for CrosswordMagazine;
0 genera for MusicGroup;
0 genera for Herb;
0 genera for CookingMagazine;
0 genera for WoodworkingMagazine;
0 genera for Vine;
0 genera for PetBirdMagazine;
0 genera for EntertainmentMagazine;
0 genera for MensPornographyMagazine;
0 genera for FamilyParentingMagazine;
0 genera for FashionModel;
0 genera for Religion;
0 genera for MidwestRegionalMagazine;
ke$
I am making very rapid progress in connecting the last 50 nogenus concepts.
I am using mke assertions and hierarchy editing commands:
species isa genus;
rhm@rhm8200 /home/rhm
$ mke --commands
dump
hobottom
hodelete => do hodelete od species [from genus] done;
hodisplay
hofind
homove => do homove od species from genus 1 to genus 2 done;
horeduce
hotop => mke --input "do hotop done;" > nogenus.txt
print
read
reset flag
select
set flag
size
write
Here are a couple of snips from my last few minutes of activity.
ke$ ss_012209 isa* ?;
ss_012209 => dissemble, pretend, act -- (behave unnaturally or affectedly; "She's just acting");
\ TOP => TOP, the World, Existence, Universe -- (top of hierarchy);
ke$ do homove od ss_012209 from TOP to Act done;
ke$ ss_012209 isa* ?;
ss_012209 => dissemble, pretend, act -- (behave unnaturally or affectedly; "She's just acting");
\ Act => act -- ( );
\\ Action => action, activity, process -- ( );
\\\ Change => change -- ( );
\\\\ Entity => entity, being, thing -- ( );
\\\\\ Reality => reality -- ( );
\\\\\\ World => world, resource, existence, universe -- ( );
\\\\\\\ TOP => TOP, the World, Existence, Universe -- (top of hierarchy);
ke$ Interaction isa* ?;
Interaction ;
ke$ Interact isa* ?;
Interact => interact -- ( );
\ Relate => relate -- ( );
\\ Action => action, activity, process -- ( );
\\\ Change => change -- ( );
\\\\ Entity => entity, being, thing -- ( );
\\\\\ Reality => reality -- ( );
\\\\\\ World => world, resource, existence, universe -- ( );
\\\\\\\ TOP => TOP, the World, Existence, Universe -- (top of hierarchy);
ke$ do homove od ss_008667 from TOP to Interact done;
ke$ ss_008667 isa* ?;
ss_008667 => compete, vie, contend -- (compete for something; engage in a contest; measure oneself against others);
\ Interact => interact -- ( );
\\ Relate => relate -- ( );
\\\ Action => action, activity, process -- ( );
\\\\ Change => change -- ( );
\\\\\ Entity => entity, being, thing -- ( );
\\\\\\ Reality => reality -- ( );
\\\\\\\ World => world, resource, existence, universe -- ( );
\\\\\\\\ TOP => TOP, the World, Existence, Universe -- (top of hierarchy);
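The effect of the hierarchy editing commands in this session can be modeled with a small parent map. The Python sketch below is an illustration only: the class `Hierarchy` and its methods are hypothetical names that loosely mirror `isa`, `homove`, `isa*` and `hotop`, and it makes no claim about how mke stores or edits its hierarchy.

```python
class Hierarchy:
    """Toy genus/species hierarchy: each concept maps to a list of genera."""

    def __init__(self):
        self.genera = {}  # concept -> list of its genera (parents)

    def isa(self, species, genus):
        # Record "species isa genus", creating either concept if needed.
        self.genera.setdefault(species, [])
        self.genera.setdefault(genus, [])
        if genus not in self.genera[species]:
            self.genera[species].append(genus)

    def homove(self, species, old_genus, new_genus):
        # Detach species from old_genus and attach it under new_genus.
        parents = self.genera[species]
        parents.remove(old_genus)  # raises ValueError if the link is absent
        self.genera.setdefault(new_genus, [])
        if new_genus not in parents:
            parents.append(new_genus)

    def isa_star(self, concept):
        # Follow the first genus of each concept upward until none remains.
        chain, seen, current = [], {concept}, concept
        while self.genera.get(current):
            current = self.genera[current][0]
            if current in seen:  # guard against accidental cycles
                break
            seen.add(current)
            chain.append(current)
        return chain

    def hotop(self):
        # Concepts with no genus at all.
        return [c for c, parents in self.genera.items() if not parents]


h = Hierarchy()
for species, genus in [("World", "TOP"), ("Reality", "World"),
                       ("Entity", "Reality"), ("Change", "Entity"),
                       ("Action", "Change"), ("Act", "Action"),
                       ("ss_012209", "TOP")]:
    h.isa(species, genus)

h.homove("ss_012209", "TOP", "Act")
print(h.isa_star("ss_012209"))
print(h.hotop())
```

After the move, the chain for ss_012209 runs through Act, Action, Change, Entity, Reality and World up to TOP, matching the order of the second isa* listing in the session.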