Garth, thanks for your suggestions.
In the end we used the following approach, which seems to work:
- get a sample fixml file from a collection we want to check (via getfixml)
- for each field name defined in the index-profile, check whether it occurs
in the fixml file. We used the regexp
'name=\"{0,1}[a-zA-Z0-9]+$FIELDNAME'.
This regexp may match too much (it also hits any longer name that merely
contains $FIELDNAME), but it works for us.
- if no matches are found, the field seems not to be used in the
collection.
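
For anyone who wants to script this, here is a rough Python sketch of the check described above. It is only an illustration, not the script we actually ran: the function name unused_fields and the sample text are made up, and the sample is not real getfixml output. The pattern is the same regexp as above, with the field name escaped.

```python
import re

def unused_fields(fixml_text, field_names):
    """Return the field names for which the regexp finds no match in fixml_text."""
    unused = []
    for field in field_names:
        # Same pattern as above: name=, an optional quote, at least one
        # alphanumeric prefix character, then the field name itself.
        pattern = re.compile(r'name="?[a-zA-Z0-9]+' + re.escape(field))
        if not pattern.search(fixml_text):
            unused.append(field)
    return unused

# Invented sample text for illustration only (not real FIXML).
sample = (
    '<context name="bcontitle">...</context>\n'
    '<context name="bconbody">...</context>'
)
print(unused_fields(sample, ["title", "body", "author"]))  # ['author']
```

Note that the pattern requires at least one prefix character before the field name, so a bare name="title" would not count as a match for the field title.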
Regards,
Patrick
--- In fast_dev@yahoogroups.com, Garth Grimm <gdgrimm@...> wrote:
>
> I don't really know of an "easy" way to do this (so I hope some others provide
> better answers), but here's some ideas:
>
> 1) If you've been keeping documentation on what fields are read from each data
> source, and what fields might be created in any customized pipeline
> stages, review it to identify the ones that are needed. Whatever fields are
> left over are candidates for removal. Take that list and compare it to the
> fields that are in the default Index Profile, removing from the list any
> fields that are there. This should give you a list of fields that can be
> removed.
> 2) You could write a script that steps through each field in the Index Profile
> doing a wildcard query. Any field that returns no results might be removable.
> This won't catch all of them -- for example, a field set to be returned in
> results only (i.e. not indexed) doesn't get analyzed using this method. But it
> will catch some of them.
> 3) This one's highly unlikely. But I worked with one client where we used
> source code generation techniques to build code that wrapped both the query
> API and the content API. Thus, any field that was needed by the code would
> generate an error in the dev's IDE if the Index Profile didn't include it. So
> we walked through a few iterations of trying different Index Profiles, until
> we got the smallest one that didn't generate any errors. Of course, this
> method won't work unless you're doing some of the heavy customization that
> this one client was.
> 4) Drop a special "spy" stage at the end of each pipeline that is used.
> Re-feed data. Then write a script analyzing the "spy" stage output to identify
> all fields that are used.
>
> Hopefully others will be able to contribute ideas.
>
> Regards,
> Garth Grimm
> Avery Ranch Consulting
> ------------------------
> All things can be solved by salt water:
> sweat, tears, or the sea
>
>
>
> ----- Original Message ----
> From: pgupta_1623 <dasgupta.patrick@...>
> To: fast_dev@yahoogroups.com
> Sent: Fri, January 21, 2011 6:42:49 AM
> Subject: [fast_dev] get fields which are not used in any collection
>
> Hi all,
>
> we have had fast esp 5.1 deployed for some years and have several collections
> which share a single index-profile.
>
> What is the easiest way to get all fields which are not used in any
> collection?
>
> Thanks,
> Patrick