Tapping into Topic Searches • John Allec, CRS • Findhelp Information Services
Topic searches • are pre-constructed database searches, carefully built to automatically generate comprehensive and up-to-date search results.
Process for 211Ontario.ca topic search development • Creating the topic list • Draft search logic • French issues • Testing of search logic • Final adjustments • Reference materials • Implementation • Maintenance
1. Creating the topic list • A team with representatives of all 211 centres met to brainstorm a draft list of categories. • A draft document was drawn up and a second meeting convened for review. • 22 categories were eventually selected, most of which have several sub-categories (see handout). Only the sub-categories actually trigger searches.
1. …Creating the topic list • Terminology should be up-to-date (e.g. ‘Older Adults’ rather than ‘Seniors’). • Opt for the general where possible: • ‘Finding a health professional’ rather than ‘Finding a medical professional’. • ‘New Parents’ rather than sub-terms ‘New Mothers’ and/or ‘New Fathers’.
1. …Creating the topic list • Province-wide consultation was crucial to identifying differing priorities. • For example, a Transportation category was added to the original set to reflect rural needs, and it was then divided into four sub-categories.
1. …Creating the topic list • Topics can be for either services or groups of people. • Using both results in repetition across categories: • Youth – Recreation, and • Recreation – Youth • This should be kept to a minimum especially if the logic for each is stored in separate places. There should be unmissable cues that any change must be made in both places.
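One way to avoid keeping the same logic in two places is to store it once and look it up from either direction. The following is a minimal sketch only; the dictionary, the function name and the placeholder logic string are hypothetical and not part of the 211Ontario.ca implementation.

```python
# Hypothetical store: one entry per pair of topics, regardless of order.
SEARCH_LOGIC = {
    frozenset({"Youth", "Recreation"}): "<search logic goes here>",
}

def logic_for(category: str, sub_category: str) -> str:
    """Return the single shared definition for a category/sub-category pair."""
    return SEARCH_LOGIC[frozenset({category, sub_category})]

# 'Youth - Recreation' and 'Recreation - Youth' resolve to the same entry,
# so a change made once applies to both.
same = logic_for("Youth", "Recreation") == logic_for("Recreation", "Youth")
print(same)
```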
1. …Creating the topic list • Decision to allow a miscellaneous ‘Other services’ sub-category under certain topics (e.g. Food) to capture services that would otherwise be omitted (community gardens, farmers markets). • But it’s a definite challenge to exclude the services already listed in the more specific sub-categories. Those must be excluded though to avoid an ‘everything’ category.
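The exclusion described above can be pictured as a set subtraction: take everything found for the broad topic and remove whatever the specific sub-categories already list. This is a sketch under assumptions; the record IDs and sub-category names are made up for illustration.

```python
# Hypothetical record IDs; none of these come from a real database.
food_all = {"R1", "R2", "R3", "R4", "R5", "R6"}  # everything found for Food
specific = {
    "Food banks": {"R1", "R2"},      # hypothetical sub-category
    "Meal programs": {"R2", "R3"},   # hypothetical sub-category
}

def other_services(all_records, specific_results):
    """Return records in the broad topic that no specific sub-category lists."""
    already_listed = set().union(*specific_results.values())
    return all_records - already_listed

other = other_services(food_all, specific)
print(sorted(other))
```

Because the exclusion is computed from the specific sub-categories, adding a new sub-category shrinks 'Other services' without anyone editing its logic by hand.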
2. Draft search logic • A master reference document should be developed that incorporates all the search logic and all data related to the topic searches (e.g. definitions, French equivalents). • This should be centrally maintained but easily accessible by all involved. It is crucial that everyone feel confident they are consulting the latest version.
2. …Draft search logic • Any type of database searching may be incorporated into the search logic. • But… in order for the system to be as self-maintaining as possible, stable and consistent data components – that are maintained regardless of topic search work – should always be the first choice. • Searching through Description text, for example, should almost always be avoided.
2. …Draft search logic • The upcoming searches on 211Ontario.ca are built on: • (mostly) Taxonomy indexing • where necessary, program names and acronyms (not as stable, and can bring noise) • as a last resort, records identified by a special code (e.g. ‘pub codes’ in CIOC), which relies on people remembering to add the codes
2. …Draft search logic • The Taxonomy search logic avoids using Taxonomy term names, and relies instead on the alphanumerical coding: • Term names may change, slightly or entirely. • Allows equivalent searching in French • Allows truncated searching, e.g. BD*
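To illustrate why code-based logic supports truncated searching, here is a small sketch that treats a trailing `*` as "starts with". The function names and sample records are hypothetical, and the codes are used only as examples of the code format.

```python
def matches_truncated(pattern: str, code: str) -> bool:
    """Match a Taxonomy code; a trailing '*' means 'any code starting with'."""
    if pattern.endswith("*"):
        return code.startswith(pattern[:-1])
    return code == pattern

# Hypothetical records: record ID -> Taxonomy codes indexed on that record.
records = {
    "R1": ["BD-1800.2000"],
    "R2": ["BD-5000"],
    "R3": ["YH-6000.0120"],
}

def search(pattern, records):
    """Return the IDs of records carrying at least one matching code."""
    return sorted(
        rid for rid, codes in records.items()
        if any(matches_truncated(pattern, c) for c in codes)
    )

hits = search("BD*", records)
print(hits)
```

A record indexed at a deeper level than the search term is still picked up, and nothing in the match depends on term names or on language.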
2. …Draft search logic • Topic search work can result in key decisions on Taxonomy indexing practices in general, especially in a far-flung network. • Ideally, therefore, topic searches would be finalized and in place before a database is converted to the Taxonomy, to avoid having to re-index later.
2. …Draft search logic • But either way, constructing and implementing topic searches is a great route to consolidated and consistent indexing, and to isolating problem indexing. • e.g. ‘Congregate dining’ used for youth services • e.g. Links for ‘Older Adults ~ Training and Employment Programs’ rather than ‘Older Worker Employment Programs’
2. …Draft search logic • Generally, the more flexible and general the search is, the less likely it is to fail. • At the same time, there is a need to pinpoint the most relevant services and avoid unsuitable results. • The ‘roll-up’ function allows for more detailed indexing choices among partners where suitable, e.g. for tourism and recreation.
2. …Draft search logic • ‘See Also’ references • These pointers to other Taxonomy terms that may be more suitable are valuable (and generally under-used). • They can give useful ideas about what else should be included in a search.
2. …Draft search logic • ‘Related concepts’ are lists of all possible terms across the Taxonomy that are relevant to a topic, available on 211Taxonomy.org. • These are very similar to topic searches, and always kept up-to-date. • Unfortunately they are often too massive to incorporate easily into a search string. But they are invaluable for reference.
2. …Draft search logic • Example in English (almost…) • Aboriginal communities - Women • (Breast Implants OR Breast Cancer OR Cervical Cancer OR Ovarian Cancer OR Menopause OR Pelvic Inflammatory Disease OR PMS OR Pregnancy/Birth Problems OR Mastectomy Patients OR Widows OR Breastfeeding Women OR Single Parent Families Headed by Mothers OR Mothers OR New Mothers OR Postpartum Mothers OR Pregnant Women OR Stay at Home Mothers OR Surrogate Mothers OR Teenage Mothers OR Single Women OR Females OR Lesbians OR Transgender Individuals OR Battered Women OR Women's Issues) LINKED WITH Aboriginal communities
2. …Draft search logic • Example in Oracle-speak: • Aboriginal communities - Women • ((YF-3000.1340_YH-6000.0120%) OR (YF-3000.1480-120_YH-6000.0120%) OR (YF-3000.1480-150_YH-6000.0120%) OR (YF-3000.1480-650_YH-6000.0120%) OR (YF-3000.5040_YH-6000.0120%) OR (YF-3000.6240_YH-6000.0120%) OR (YF-3000.6640_YH-6000.0120%) OR (YF-3000.6840_YH-6000.0120%) OR (YF-6000.8000-500_YH-6000.0120%) OR (YH-6000.0120% NEAR YJ-0900.1350-970) OR (YH-6000.0120% NEAR YJ-0920) OR (YH-6000.0120% NEAR YK-2000.8000-900) OR (YH-6000.0120% NEAR YK-6500.4900) OR (YH-6000.0120% NEAR YK-6500.6100-650) OR (YH-6000.0120% NEAR YK-6500.6550) OR (YH-6000.0120% NEAR YK-6500.6600%) OR (YH-6000.0120% NEAR YK-6500.8050-840) OR (YH-6000.0120% NEAR YK-6500.8200) OR (YH-6000.0120% NEAR YK-6500.8500-900) OR (YH-6000.0120% NEAR YK-8200.8100) OR (YH-6000.0120% NEAR YS-2000%) OR (YH-6000.0120% NEAR YT-2400.4500) OR (YH-6000.0120% NEAR YT-2400.8500%) OR (YH-6000.0120% NEAR YX-0300.1200) OR (YH-6000.0120% NEAR YZ-9000)) within TID
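A string of this length is easier to maintain if it is generated from a list of codes rather than edited by hand. The sketch below reproduces only the `(target NEAR code)` clause shape and the `within TID` suffix seen in the example above; the function name is hypothetical, and no claim is made about how the search engine interprets the operators.

```python
def build_near_query(target: str, codes: list[str], field: str = "TID") -> str:
    """Join one '(target NEAR code)' clause per code with OR, as in the slide."""
    clauses = [f"({target} NEAR {code})" for code in codes]
    return "(" + " OR ".join(clauses) + f") within {field}"

# Three of the codes from the example above, paired with the target code.
query = build_near_query("YH-6000.0120%", ["YJ-0920", "YS-2000%", "YZ-9000"])
print(query)
```

Keeping the code list as data also makes it straightforward to compare against a new Taxonomy release during maintenance.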
3. French issues • There was no need identified for different topic searches in French – except a decision to have an extra category in French for ‘French designated agencies’. • Fortunately, search logic based on Taxonomy codes works for either language. (Yet another reason to base as much logic as possible on the Taxonomy.) • But… any logic that draws on program names or acronyms must be developed separately in French.
4. Testing of search logic • Field testing should be organized across the entire data network, especially to gauge regional differences (and encourage further buy-in). • All testers should have: • easy access to up-to-date search logic for reference • easy access to a test version of the public site • user-friendly test assignments, with carefully organized results reporting
4. …Testing of search logic • The data management software may be a different application from the public web site, as is the case for 211Ontario.ca. • Unfortunately this means different search strategies are likely needed to replicate the results, because of differences in search language. • These should of course be carefully saved along with the main search logic, preferably in the same document, for future work.
4. …Testing of search logic • Another alternative for cross-checking results is to compare against a reliable outside list, e.g. a government list of licensed child care centres. • That of course may show up discrepancies in the data, rather than in the search logic.
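A cross-check of this kind amounts to comparing two sets of names. The sketch below is illustrative only: the agency names are invented, and the light normalization (case and spacing) is an assumption about what a comparison would need.

```python
def normalise(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace before comparing."""
    return " ".join(name.casefold().split())

def cross_check(search_results, reference_list):
    """Compare search results with an outside list of names."""
    found = {normalise(n) for n in search_results}
    expected = {normalise(n) for n in reference_list}
    missing = sorted(expected - found)      # on the outside list, not in results
    unexpected = sorted(found - expected)   # in results, not on the outside list
    return missing, unexpected

# Invented names, for illustration only.
results = ["Maple Child Care Centre", "Sunrise  Daycare"]
licensed = ["maple child care centre", "Riverside Child Care"]
missing, unexpected = cross_check(results, licensed)
print(missing, unexpected)
```

Either list still needs a human look: a missing name may point to a record that was never created or indexed, not to a fault in the search logic.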
5. Final adjustments • Collection of final feedback • Corrections to search logic where necessary • Re-testing where necessary
6. Reference Materials • The up-to-date search logic should always be easily accessible by all involved. • Any program templates made available to data contributors (e.g. ‘Employment Resource Centres’) must reflect the choices made for topic searches, whether regarding Taxonomy indexing, program names, acronyms, etc.
6. …Reference Materials • A ‘starter set’ is a reference document that identifies the minimum required level of indexing in each area of the Taxonomy in a data network. • It is simply a list of all available terms indicating whether the default for each is activation or de-activation.
6. …Reference Materials • No one in the database network may index at a higher (more general) level than indicated in the starter set. • But conversely, all are encouraged to index at a deeper, more specific level if that better reflects local needs.
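The starter-set rule can be checked mechanically if deeper terms are assumed to extend the code of their parent term. The sketch below is one reading of the rule, not a description of any existing tool; the starter-set contents and function name are hypothetical.

```python
# Hypothetical starter set: codes whose default is 'activated'.
starter_active = {"BD-1800", "BH-1800.8500"}

def is_allowed(code: str, starter_active: set) -> bool:
    """Allow a code that is a starter term or sits below one.

    Assumes a deeper term's code begins with its parent's code.
    """
    return any(code.startswith(s) for s in starter_active)

checks = {
    "BD": is_allowed("BD", starter_active),                      # too general
    "BD-1800": is_allowed("BD-1800", starter_active),            # starter level
    "BD-1800.2000": is_allowed("BD-1800.2000", starter_active),  # deeper
}
print(checks)
```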
6. …Reference Materials • Topic searches are typically built on the starter-level terms. • But the ‘roll-up’ function and truncated search logic (e.g. BD*) ensure that records indexed more specifically will be gathered up into the larger category. • Ideally, the reference Starter Set would be constructed after topic searches are worked out, to avoid re-indexing, but that may not be possible.
7. Implementation • If you have an existing set of topic searches already in place: • Consider phasing in the new searches in groups, incorporating them one by one to replace the older ones.
7. …Implementation • If you are converting from an older indexing system to the Taxonomy: • For a transition period, consider having searches draw on both the old indexing system and the Taxonomy, especially if you’re not confident about the state of the Taxonomy indexing.
8. Maintenance • Very important: • A system must be in place to accommodate ongoing changes to the master Taxonomy.
8. …Maintenance • Must check for any changes that affect existing search logic: • New terms that are relevant • Terms that have disappeared • Code changes (resulting from changes in term names that affect sorting) • Other changes (definitions, Used Fors) can usually be safely ignored
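Part of this check can be automated by comparing the code lists of two Taxonomy releases against the codes used in the search logic. The sketch below is a rough illustration: the function name is hypothetical, and codes not shown in the earlier example are made up. It flags only new codes that sit under a code already in the logic, so relevant new terms elsewhere still need a human review.

```python
def review_changes(old_codes, new_codes, used_patterns):
    """Flag removed codes used by the logic and new codes beneath used codes."""
    # Strip trailing wildcard characters so patterns can be used as prefixes.
    prefixes = [p.rstrip("%*") for p in used_patterns]

    def touches(code):
        return any(code.startswith(p) for p in prefixes)

    removed = sorted(set(old_codes) - set(new_codes))
    added = sorted(set(new_codes) - set(old_codes))
    return {
        "removed_in_use": [c for c in removed if touches(c)],
        "added_under_used": [c for c in added if touches(c)],
    }

# Hypothetical before/after code lists.
old = {"YH-6000.0120", "YJ-0920", "YZ-9000", "BD-1800"}
new = {"YH-6000.0120", "YJ-0920", "YJ-0920.5000", "BD-1800", "BD-1900"}
used = ["YH-6000.0120%", "YJ-0920", "YZ-9000"]

report = review_changes(old, new, used)
print(report)
```

A code change shows up here as one removal plus one addition, which is a cue to update the search logic and the reference materials together.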
8. …Maintenance • Any changes to search logic must of course also be reflected in any reference materials (templates, starter sets). • Other types of changes may also have an effect on search logic, e.g. acronyms for program names may change. • An annual review is suggested to make sure searches are still intact.
Questions?? • John Allec • email@example.com • 416-392-4572