fast_dev · FAST Search Engine Developers

Group Information

  • Members: 133
  • Category: Software
  • Founded: Feb 2, 2009
  • Language: English
Message #112 of 113
Re: get fields which are not used in any collection

Garth, thanks for your suggestions.

In the end we used the following approach, which seems to work:

- Get a sample fixml file for the collection we want to check (via getfixml).

- For each field name defined in the index-profile, check whether it
appears in the fixml file. We used the regexp
'name=\"{0,1}[a-zA-Z0-9]+$FIELDNAME'.
This regexp might match more than intended, but it works for us.

- If no matches are found, the field appears to be unused in that
collection.
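
For anyone who wants to script the check described above, here is a minimal sketch in Python. It assumes the field names have already been pulled out of the index-profile and that the fixml sample has been read into a string. The function name unused_fields and the sample fragment are made up for illustration; they are not part of FAST ESP. Like the regexp in the post, the pattern has no anchor after the field name, so a short field name can also match a longer one that ends the same way.

```python
import re


def unused_fields(field_names, fixml_text):
    """Return the field names that have no match in the given fixml text."""
    unused = []
    for field in field_names:
        # Same shape as the regexp in the post: an optional double quote,
        # an alphanumeric prefix of at least one character, then the
        # field name itself (escaped so it is matched literally).
        pattern = re.compile(r'name="?[a-zA-Z0-9]+' + re.escape(field))
        if not pattern.search(fixml_text):
            unused.append(field)
    return unused


# Hypothetical fixml fragment, for illustration only.
sample = (
    '<context name="bcontitle">Hello</context>'
    '<context name="bconbody">Some text</context>'
)

print(unused_fields(["title", "body", "author"], sample))
```

Running the check against one sample document per collection and intersecting the results would give the fields that show up in none of the samples. That only says something about the sampled documents, not about the whole collection.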

Regards,
Patrick


--- In fast_dev@yahoogroups.com, Garth Grimm <gdgrimm@...> wrote:
>
> I don't really know of an "easy" way to do this (so I hope some others provide
> better answers), but here are some ideas:
>
> 1) If you've been keeping documentation on which fields are read from each
> data source, and which fields might be created in any customized pipeline
> stages, review it to identify the ones that are needed. Whatever fields are
> left over are candidates for removal. Take that list and compare it to the
> fields that are in the default Index Profile, removing from the list any
> fields that are there. This should give you a list of fields that can be
> removed.
> 2) You could write a script that steps through each field in the Index Profile
> doing a wildcard query. Any field that returns no results might be removable.
> This won't catch all of them -- for example, a field set to be returned in
> results only (i.e. not indexed) doesn't get analyzed using this method. But it
> will catch some of them.
> 3) This one's highly unlikely. But I worked with one client where we used
> source code generation techniques to build code that wrapped both the query
> API and the content API. Thus, any field that was needed by the code would
> generate an error in the dev's IDE if the Index Profile didn't include it. So
> we walked through a few iterations of trying different Index Profiles, until
> we got the smallest one that didn't generate any errors. Of course, this
> method won't work unless you're doing some of the heavy customization that
> this one client was.
> 4) Drop a special "spy" stage at the end of each pipeline that is used.
> Re-feed data. Then write a script analyzing the "spy" stage output to identify
> all fields that are used.
>
> Hopefully others will be able to contribute ideas.
>
> Regards,
> Garth Grimm
> Avery Ranch Consulting
>  ------------------------
> All things can be solved by salt water:
> sweat, tears, or the sea
>
>
>
> ----- Original Message ----
> From: pgupta_1623 <dasgupta.patrick@...>
> To: fast_dev@yahoogroups.com
> Sent: Fri, January 21, 2011 6:42:49 AM
> Subject: [fast_dev] get fields which are not used in any collection
>
> Hi all,
>
> We have had FAST ESP 5.1 deployed for some years and have several collections
> that share a single index-profile.
>
> What is the easiest way to get all fields that are not used in any
> collection?
>
> Thanks,
> Patrick

Wed Jan 26, 2011 11:03 am

pgupta_1623

Hi all, we have deployed fast esp 5.1 since some years and have several collections which share a single index-profile. What is the easiest way to get all...
pgupta_1623, Jan 21, 2011 2:55 pm

I don't really know of an "easy" way to do this (so I hope some others provide better answers), but here's some ideas: 1) If you've been keeping...
Garth Grimm (gdgrimm), Jan 21, 2011 3:30 pm

Garth, thanks for your suggestions. Eventually we used following approach, which seems to work: - get a sample fixml file of a collection we want to check (via...
pgupta_1623, Jan 26, 2011 12:14 pm

Hello, Sounds good.  Using fixml instead of the output of a 'spy' stage is a good idea. The only thing that might be an issue is if the 'sample' document...
Garth Grimm (gdgrimm), Jan 26, 2011 12:20 pm