A bookstore wants visitors searching *"1920s American novel"* to find The Great Gatsby. A real estate site wants *"two-bedroom condo near downtown"* to surface the right listings. A recipe site wants *"vegan Italian dinner under 30 minutes"* to land on the right pages. All three need search that understands data living outside the post body — in custom taxonomies like book_genre, book_era, property_type, location, cuisine, dietary, course.
WordPress's default search doesn't read taxonomies. Most third-party search plugins don't either. The current Queryra plugin does — automatically, with no setup. This post explains what changed, what it means for stores with rich data models, and how it works under the hood.
## What custom taxonomies are, and why default WordPress search ignores them
WordPress lets you register any number of custom taxonomies — categories of structured data attached to posts, products, or any custom post type. Built-in examples: category (for posts) and product_cat (for WooCommerce). Custom examples that sites add for their own data models:
- A bookstore registers book_genre, book_author, book_publisher, book_series, book_era
- A real estate plugin registers property_type, location, amenities, building_year
- A recipe plugin registers cuisine, dietary, course, difficulty, cooking_method
- A music streaming site registers artist, album, genre, mood, instrument
- An events plugin registers venue, event_category, organizer, event_type
These taxonomies are real data in the WordPress database (wp_terms, wp_term_taxonomy, and wp_term_relationships), but default WordPress search ignores them. Search runs as a SQL LIKE '%query%' against post_title, post_excerpt, and post_content only. A book tagged book_genre: fiction won't appear when a visitor searches *"fiction novel"* unless the word "fiction" literally appears in the book's title, excerpt, or description.
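To make the blind spot concrete, here is a minimal sketch in plain PHP of a substring search over a post's text columns. The function name `default_search_matches` and the sample post are hypothetical, and the real query is built in SQL by WordPress, so treat this as an approximation of the behaviour rather than the actual implementation.

```php
<?php
// Hypothetical stand-in for the default search: every word of the query
// must appear, case-insensitively, in at least one searched text column.
function default_search_matches(string $query, array $textColumns): bool
{
    $haystack = implode(' ', $textColumns);
    foreach (preg_split('/\s+/', trim($query)) as $word) {
        if ($word !== '' && stripos($haystack, $word) === false) {
            return false;
        }
    }
    return true;
}

// Only the text columns are searched.
$columns = [
    'post_title'   => 'The Great Gatsby',
    'post_content' => 'Jay Gatsby pursues Daisy Buchanan on Long Island.',
];

// Term assignments live in separate tables, so the matcher never sees them.
$terms = ['book_genre' => ['fiction'], 'book_era' => ['1920s']];

var_dump(default_search_matches('gatsby', $columns));        // bool(true)
var_dump(default_search_matches('fiction novel', $columns)); // bool(false)
```

The second call fails even though the book is tagged book_genre: fiction, because `$terms` is never part of the searched text.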
This is the same architectural blind spot that affects WordPress's relationship with page builders — important data lives in the database, but the search engine doesn't know how to read it.
## What Queryra does with custom taxonomies
The current Queryra plugin auto-detects every public taxonomy registered on your site and sends terms to the AI search index alongside the standard fields. Specifically:
- Auto-detection. The plugin queries WordPress for taxonomies registered with public => true, on every sync. No whitelist to maintain, no settings to toggle. Add a new taxonomy from your theme or another plugin and it's picked up on the next sync.
- Smart exclusions. Built-in taxonomies covered by dedicated fields are skipped automatically: category, post_tag, product_cat, product_tag, product_brand, yith_product_brand, pwb-brand, post_format, nav_menu, link_category. These are already searchable via the standard categories/tags/brand plumbing.
- Structured payload. Remaining taxonomies are sent to the Queryra API as a map of slugs to comma-separated term names, e.g.:

      {
        "book_genre": "fiction, classic, american-literature",
        "book_author": "F. Scott Fitzgerald",
        "book_era": "1920s",
        "book_setting": "new york"
      }

On the backend, each taxonomy contributes two things simultaneously: it's appended to the embedding text (so AI semantic search picks up *"twentieth century novel"* matching a book tagged book_era: 1920s), and it's stored as filterable metadata (so future intent-aware queries like *"vegan Italian dinner"* can pre-filter by cuisine: italian + dietary: vegan before the semantic step).
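The exclusion and payload steps can be sketched in plain PHP as follows. The helper `build_taxonomies_payload` and its input shape (taxonomy slug mapped to a list of term names for one post) are assumptions made for illustration; the plugin's internal code is not shown in this post.

```php
<?php
// Slugs covered by dedicated fields, copied from the exclusion list above.
const EXCLUDED_TAXONOMIES = [
    'category', 'post_tag', 'product_cat', 'product_tag', 'product_brand',
    'yith_product_brand', 'pwb-brand', 'post_format', 'nav_menu', 'link_category',
];

// Hypothetical helper: builds the slug => "term, term" map for one post.
function build_taxonomies_payload(array $termsByTaxonomy): array
{
    $payload = [];
    foreach ($termsByTaxonomy as $slug => $termNames) {
        // Skip excluded taxonomies and taxonomies with no terms assigned.
        if (in_array($slug, EXCLUDED_TAXONOMIES, true) || $termNames === []) {
            continue;
        }
        $payload[$slug] = implode(', ', $termNames);
    }
    return $payload;
}

$payload = build_taxonomies_payload([
    'category'    => ['Books', 'Fiction'],
    'book_genre'  => ['fiction', 'classic', 'american-literature'],
    'book_author' => ['F. Scott Fitzgerald'],
    'book_era'    => ['1920s'],
]);

echo json_encode($payload, JSON_PRETTY_PRINT), PHP_EOL;
```

In a real plugin the input would come from WordPress term lookups rather than a hard-coded array; the point here is only the shape of the map that ends up in the payload.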
## Real-world examples by industry
The most useful way to think about this is by use case. A few common patterns:
### Bookstore / library
Taxonomies typical to a Books custom post type or WooCommerce books category:
- book_genre: fiction, non-fiction, fantasy, romance, mystery
- book_author: F. Scott Fitzgerald, Ursula K. Le Guin
- book_publisher: Penguin Classics, Vintage
- book_series: Earthsea, Wheel of Time
- book_era: 1920s, Victorian, Contemporary
- book_setting: New York, Middle Earth, dystopian future
A search like *"twentieth century American novel about ambition"* now lands on Gatsby — the semantic match catches "twentieth century" against book_era: 1920s (close enough), "American" against book_genre: american-literature, and "ambition" semantically resolves through the book's description plus author context.
### Real estate listings
- property_type: condo, single-family, multi-family, commercial
- location: Park Slope, Williamsburg, downtown Boston
- amenities: pool, parking, gym, doorman
- building_year: pre-war, mid-century, new construction
*"Pre-war doorman building in Park Slope"* — three taxonomies in one query, all extracted from the natural language and matched against listings.
### Recipes
- cuisine: Italian, Mexican, Japanese
- dietary: vegan, gluten-free, keto
- course: appetizer, main, dessert
- difficulty: easy, intermediate, advanced
- cooking_method: baked, grilled, no-cook
*"Easy gluten-free dessert"* picks up difficulty: easy + dietary: gluten-free + course: dessert simultaneously.
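One naive way to picture how a single query can touch several taxonomies is a vocabulary lookup, sketched below in plain PHP. This is not Queryra's matching logic, which is semantic and is not described at this level in the post; the function `extract_taxonomy_filters` and the vocabulary array are hypothetical.

```php
<?php
// Hypothetical: find which known terms appear as whole words in the query.
function extract_taxonomy_filters(string $query, array $vocabulary): array
{
    // Lowercase, replace punctuation (except hyphens) with spaces, pad with spaces.
    $normalized = trim(preg_replace('/[^a-z0-9-]+/', ' ', strtolower($query)));
    $needle = ' ' . $normalized . ' ';

    $filters = [];
    foreach ($vocabulary as $taxonomy => $terms) {
        foreach ($terms as $term) {
            if (strpos($needle, ' ' . strtolower($term) . ' ') !== false) {
                $filters[$taxonomy][] = strtolower($term);
            }
        }
    }
    return $filters;
}

$vocabulary = [
    'cuisine'    => ['Italian', 'Mexican', 'Japanese'],
    'dietary'    => ['vegan', 'gluten-free', 'keto'],
    'course'     => ['appetizer', 'main', 'dessert'],
    'difficulty' => ['easy', 'intermediate', 'advanced'],
];

$filters = extract_taxonomy_filters('Easy gluten-free dessert', $vocabulary);
print_r($filters);
```

An exact-word lookup like this misses synonyms and paraphrases ("quick", "simple", "no gluten"), which is the gap a semantic embedding is meant to close.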
### Music / streaming
- artist: David Bowie, Joni Mitchell
- album: Hunky Dory, Blue
- genre: folk-rock, glam-rock, ambient
- mood: melancholic, energetic, contemplative
- instrument: piano-driven, guitar-driven
*"Melancholic piano-driven folk"* — three custom taxonomies, all part of the same semantic vector.
### Events
- venue: Madison Square Garden, Roundhouse London
- event_category: concert, conference, workshop
- organizer: official festival, independent promoter
### WooCommerce stores
WooCommerce stores using global product attributes (defined in Products → Attributes) get those indexed automatically via WordPress's pa_* taxonomies (pa_color, pa_size, pa_material, pa_brand). Any custom taxonomy you register beyond those — recommended_age, season, room, style — joins the same pipeline.
The pattern repeats across nearly every niche store. Once your data is modelled as taxonomies (which is how WordPress encourages you to model structured attributes), AI semantic search can use it without any per-site configuration.
## How it works under the hood
The flow is straightforward and mirrors how categories and brand have always worked — just expanded to N custom taxonomies:
- WordPress plugin side. On sync, the plugin calls get_taxonomies(['public' => true]) to discover taxonomies, then wp_get_object_terms($post_id, $taxonomy) to fetch terms for each post. Terms are joined with commas and packaged into a taxonomies field in the API record payload.
- Backend storage. Queryra's records database has a dedicated taxonomies column (JSONB on Postgres). The map is stored as-is.
- Embedding text. During the sync-to-search-index step, each taxonomy is appended as a "Label: terms" line to the document text that goes into the AI embedding, following the same pattern as the existing "Brand: nike" and "Categories: Books, Fiction" lines. For a Gatsby record, the embedding text gets:

      The Great Gatsby. Brand: penguin classics. Categories: Books, Fiction.
      Book Genre: fiction, classic, american-literature.
      Book Author: F. Scott Fitzgerald.
      Book Era: 1920s.
      Book Setting: new york.
      [full description text follows]

  Result: the AI embedding for this record now "knows" about the era, the author, and the setting, even though none of those words might appear in the post body.
- Filterable metadata. Each taxonomy is also flattened into ChromaDB metadata with a tax_ prefix: tax_book_era: "1920s", tax_book_setting: "new york", lowercased for case-insensitive matching. This enables future intent-parser-driven filters: *"books set in New York"* can pre-filter to records where tax_book_setting contains "new york" before the semantic search step.
- Hash-based change detection. When you edit a term assignment (add book_era: 1920s to a post that previously had book_era: contemporary), the plugin's content hash changes, triggering a re-sync of that specific record. No full re-import needed.
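The embedding-text, metadata, and change-detection steps above can be sketched in plain PHP. All four function names are hypothetical, the use of md5 over a JSON encoding is an assumption (the post does not say which hash the plugin uses or what else the content hash covers), and the backend itself may not be written in PHP at all.

```php
<?php
// Turn a slug such as "book_era" into the label "Book Era".
function taxonomy_label(string $slug): string
{
    return ucwords(str_replace(['_', '-'], ' ', $slug));
}

// One "Label: terms." line per taxonomy, ready to append to the document text.
function embedding_lines(array $taxonomies): array
{
    $lines = [];
    foreach ($taxonomies as $slug => $terms) {
        $lines[] = taxonomy_label($slug) . ': ' . $terms . '.';
    }
    return $lines;
}

// Flatten into lowercased metadata keyed with the tax_ prefix.
// strtolower only handles ASCII; non-ASCII terms would need mb_strtolower.
function filter_metadata(array $taxonomies): array
{
    $meta = [];
    foreach ($taxonomies as $slug => $terms) {
        $meta['tax_' . $slug] = strtolower($terms);
    }
    return $meta;
}

// Assumed hashing scheme: sort keys so equal maps always hash equally.
function taxonomies_hash(array $taxonomies): string
{
    ksort($taxonomies);
    return md5(json_encode($taxonomies));
}

$tax = ['book_era' => '1920s', 'book_setting' => 'New York'];

print_r(embedding_lines($tax));
print_r(filter_metadata($tax));

// Changing one term assignment changes the hash, which would flag the record for re-sync.
$edited = ['book_era' => 'contemporary', 'book_setting' => 'New York'];
var_dump(taxonomies_hash($tax) !== taxonomies_hash($edited)); // bool(true)
```

Keeping the original casing in the embedding line while lowercasing the metadata value reflects the two different jobs: the embedding wants natural text, while the filter wants a predictable string to compare against.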
## Tuning what gets indexed
Auto-detection works for most sites without setup. For sites that need precise control — private taxonomies that shouldn't go to search, slug renames for cleaner labels, or restricting to a specific list — Queryra exposes the queryra_indexable_taxonomies filter:

    add_filter('queryra_indexable_taxonomies', function ($taxonomies_map, $post) {
        // Only index specific taxonomies on the 'book' post type
        if ($post->post_type === 'book') {
            return array_intersect_key(
                $taxonomies_map,
                array_flip(['book_genre', 'book_author', 'book_era'])
            );
        }
        return $taxonomies_map;
    }, 10, 2);

The filter receives the auto-detected map and returns whatever you want sent. Common patterns: whitelist a small subset, exclude internal/private taxonomies, rename slugs for readability before they hit the embedding (e.g. book_genre → Book Genre happens automatically, but you can override).
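The exclude and rename patterns might look like the sketch below. The callback name `my_filter_taxonomies`, the `internal_` slug prefix, and the `period` key are all hypothetical, and it is an assumption that Queryra derives the embedding label from whatever key the filter returns. The callback is a plain function so it can be exercised outside WordPress.

```php
<?php
// Hypothetical callback for the queryra_indexable_taxonomies filter.
function my_filter_taxonomies(array $taxonomies_map, $post): array
{
    // Exclude taxonomies that follow an (assumed) internal naming convention.
    foreach (array_keys($taxonomies_map) as $slug) {
        if (strpos($slug, 'internal_') === 0) {
            unset($taxonomies_map[$slug]);
        }
    }

    // Rename a slug so it reads more naturally in the embedding text.
    if (isset($taxonomies_map['book_era'])) {
        $taxonomies_map['period'] = $taxonomies_map['book_era'];
        unset($taxonomies_map['book_era']);
    }

    return $taxonomies_map;
}

// Register only when running inside WordPress, so the file also loads standalone.
if (function_exists('add_filter')) {
    add_filter('queryra_indexable_taxonomies', 'my_filter_taxonomies', 10, 2);
}

$result = my_filter_taxonomies(
    ['book_genre' => 'fiction', 'internal_flags' => 'needs-review', 'book_era' => '1920s'],
    (object) ['post_type' => 'book']
);
print_r($result);
```

Because a rename changes the key that reaches the index, any metadata filter built on the old slug would need to use the new one as well.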
Full documentation of this filter and the companion queryra_indexable_meta_content lives in our developer filters guide (coming soon).
## What you don't need to do
Worth being explicit about, because the absence of setup is the feature:
- No whitelist to maintain. Every public taxonomy on your site gets indexed automatically. Add a new taxonomy next month — picked up on next sync.
- No editor changes. Authors keep using the standard WordPress UI to assign terms (the metabox in the post editor sidebar). No special editing experience required.
- No theme modifications. Queryra hooks into WordPress at the SQL search layer, so your single-book.php template, your archive pages, and your shop layout are all unaffected.
- No re-import. Once you upgrade Queryra, the next sync (manual or automatic on post save) picks up taxonomies. Existing records get refreshed naturally on edit.
- No data migration. Taxonomies stay in WordPress's standard wp_terms, wp_term_taxonomy, and wp_term_relationships tables. Queryra reads them via the WordPress API.
## TL;DR
- Default WordPress search ignores custom taxonomies (book_genre, material, property_type, cuisine, venue, etc.). They don't live in post_content, so SQL LIKE queries miss them.
- Queryra now auto-detects every public custom taxonomy on your site and sends terms to the AI semantic search index.
- Built-in taxonomies (category, post_tag, product_cat/tag, brand variants) are skipped — already covered by dedicated fields.
- Each taxonomy contributes to both the embedding (semantic match) and filterable metadata (precise filter) — same pattern as how categories and brand have always worked.
- Use cases by industry: bookstore (book_genre, book_era), real estate (property_type, amenities), recipes (cuisine, dietary), music (mood, instrument), events (venue), and WooCommerce stores with custom attribute taxonomies.
- Zero setup. Auto-detection on every sync. For precise control, the queryra_indexable_taxonomies developer filter lets you override the auto-detected list.
If your site has rich custom data models, this is the difference between *"the search engine misses half my product attributes"* and *"every meaningful attribute on every record is searchable"* — without anyone touching the editor experience.