Catalog data entities

This article provides guidance on how to configure catalog data entities in the Intelligent Recommendations data contract.

Data entities review

A data entity is a set of one or more data text files, each having a list of columns (also called attributes) and rows containing the actual data values.

Intelligent Recommendations defines logical groups of data entities, each with its own purpose.

Note

Data entities are optional, unless explicitly stated otherwise, which means that their data can be empty or missing.

Go to the full list of data entities

Introduction to catalog data entities

The catalog data entity represents all items and item variants that are candidates for appearing in recommendations results. Candidates are determined by applying availabilities to items. An availability is a date range that tells the system when to include an item in the recommendations results. Without a valid availability, items are ignored when results are returned.

Intelligent Recommendations supports the following features and scenarios:

  • Items with multiple variations (for example, a shirt in different sizes or colors) or no variations at all. We refer to these variations as variants. Items that have no variants are called standalone items, while items with at least one variant are called item masters.

  • Assigning filter values to items (for example, category, color, or size). Later, when querying for recommendations, you can filter by these filter values.

  • Assigning images to items.

  • Items may be available within different logical entities within the organization. Intelligent Recommendations supports two levels of hierarchies:

    • Channel: Items can be assigned to a channel, allowing Intelligent Recommendations to provide recommendations scoped to only products included in a specific channel. All items are automatically associated with the default channel, which uses the string 0 (zero) as the reserved channel ID.

      Example:

      In this example, the dataset contains only three items: X, Y, and Z. These three items are automatically assigned to the default channel (Channel=0). You can also assign these items to your own custom channels. For example, you can assign items X and Y to Channel=C1 and items Y and Z to Channel=C2.

      So, when requesting recommendations, you can pass these other query parameters:

      • No Channel parameter (equals default channel): All three items can be returned in the response
      • Channel=0: Same as no parameter since this channel is the default
      • Channel=C1: Only items that belong to C1 channel (items X and Y) may be returned in the response
      • Channel=C2: Only items that belong to C2 channel (items Y and Z) may be returned in the response
      • Channel=SomethingElse: Empty response because this channel wasn’t defined and no items are assigned to it
    • Catalog: A catalog is another, finer level of availability granularity. It allows you to define multiple catalogs within a channel and get recommendations for specific catalogs. Similar to a channel, all items are automatically associated with the default catalog within a channel, which uses the string 0 (zero) as the reserved catalog ID.

      Example:

      Continuing with the Channel example, you have items X, Y, and Z. You assigned items X and Y to channel C1, and they're automatically assigned to the default catalog in the channel (using Catalog=0). You can have further granularity by assigning these items to custom catalogs within the channel. Let’s assign item X to Catalog=A and items X and Y to Catalog=B.

      So, when requesting recommendations, you can pass these other query parameters:

      • Channel=C1: No catalog parameter, equals default catalog. Both items X and Y can be returned in the response.
      • Channel=C1&Catalog=0: Same as no catalog parameter because this catalog is the default.
      • Channel=C1&Catalog=A: Only items that belong to A catalog in channel C1 (item X only) may be returned in the response.
      • Channel=C1&Catalog=B: Only items that belong to B catalog in channel C1 (items X and Y) may be returned in the response.
      • Channel=C1&Catalog=SomethingElse: Empty response because this catalog wasn’t defined in channel C1 and no items are assigned to it.
  • Declare item availabilities:

    • Availability start/end dates: Items outside of their availability time range will be excluded from the recommendation response.
    • Fine granularity of availability: Define the start/end dates within specific channel/catalog IDs.
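
The channel and catalog examples above can be sketched as a simple membership lookup. This is a minimal illustration under stated assumptions, not how Intelligent Recommendations is implemented: `EXPLICIT`, `build_memberships`, and `candidates` are hypothetical names, and availability dates are ignored.

```python
DEFAULT_ID = "0"  # reserved ID of the default channel and the default catalog

# Explicit assignments from the examples, as (item, channel, catalog).
# A catalog of DEFAULT_ID here means "assigned to the channel only".
EXPLICIT = [
    ("X", "C1", "A"),
    ("X", "C1", "B"),
    ("Y", "C1", "B"),
    ("Y", "C2", DEFAULT_ID),
    ("Z", "C2", DEFAULT_ID),
]

def build_memberships(explicit):
    """Expand explicit assignments with the automatic default memberships."""
    memberships = set()
    for item, channel, catalog in explicit:
        memberships.add((item, DEFAULT_ID, DEFAULT_ID))  # default channel
        memberships.add((item, channel, DEFAULT_ID))     # default catalog of the channel
        memberships.add((item, channel, catalog))        # the explicit assignment itself
    return memberships

def candidates(memberships, channel=DEFAULT_ID, catalog=DEFAULT_ID):
    """Return the items that may appear for the given channel and catalog."""
    return sorted(
        item for item, ch, ca in memberships if ch == channel and ca == catalog
    )

MEMBERSHIPS = build_memberships(EXPLICIT)
print(candidates(MEMBERSHIPS))                         # ['X', 'Y', 'Z']
print(candidates(MEMBERSHIPS, "C1"))                   # ['X', 'Y']
print(candidates(MEMBERSHIPS, "C1", "A"))              # ['X']
print(candidates(MEMBERSHIPS, "C1", "SomethingElse"))  # []
```

Omitting a parameter falls back to the reserved ID 0, which mirrors the "no parameter equals default" rows in the examples.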

The catalog is composed of several data entities. All of them are optional (depending on which features you want to use) and can be empty or missing from the Intelligent Recommendations root folder. If you don't want to provide the Reco_ItemsAndVariants data entity, follow the guidelines in the Items and variants section later in this article.

List of Catalog data entities

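The following data entities are part of the catalog:

  • Items and variants (Reco_ItemsAndVariants)

  • Item categories (Reco_ItemCategories)

  • Item and variant images (Reco_ItemAndVariantImages)

  • Item and variant filters (Reco_ItemAndVariantFilters)

  • Item and variant availabilities (Reco_ItemAndVariantAvailabilities)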


Items and variants

Data entity name: Reco_ItemsAndVariants

Description: All items and item variants

Attributes:

Name | Data type | Mandatory | Default value | Invalid value behavior | Comments
ItemId | String(16) | Yes | | Drop entry | See Required data entities per recommendations scenario for item ID.
ItemVariantId | String(16) | No | | Drop entry | See Required data entities per recommendations scenario for item variant ID.
Title | String(256) | No | | Trim value | Length limited to 256 characters.
Description | String(2048) | No | | Trim value | Length limited to 2048 characters.
ReleaseDate | DateTime | No | 1970-01-01T00:00:00.000Z | Drop entry | See Required data entities per recommendations scenario for DateTime values.

Guidelines:

  • Item variants inherit the attributes of their item master. For example, if an item variant has no title, it inherits the title of its item master (that is, the row with the same ItemId but with an empty ItemVariantId) if it exists.

  • ItemIds may have a one-to-many relationship with ItemVariantIds. A single ItemId can be mapped to more than one ItemVariantId to capture the relationship from an item master to its item variants. It's also possible to have a single entry for a specific ItemId and ItemVariantId combination without specifying other ItemId to ItemVariantId combinations.

  • The ReleaseDate attribute represents the date at which the item was released (published, introduced) on the market. This attribute is different from the availability of an item (when an item/product can be returned in an API call), but ReleaseDate might be used in scenarios like New and Trending, which rely on dates for item ordering.

  • If this data entity is empty (or missing), Intelligent Recommendations will automatically use all items and item variants found in the Reco_Interactions data entity as the set of catalog items and assign each item and item variant with the default title, description, and release date. These items are considered as always available unless they were assigned explicit availabilities in the Reco_ItemAndVariantAvailabilities data entity.

  • Intelligent Recommendations can use the Title and Description attributes to provide textual-based recommendations. Because Intelligent Recommendations currently supports only the en-us locale for textual recommendations, providing the Title and Description in any other locale might degrade the textual recommendations quality.

Sample data:

Headers appear for convenience only and shouldn't be part of the actual data.

ItemId | ItemVariantId | Title | Description | ReleaseDate
Item1 | | | | 2018-05-15T13:30:00.000Z
Item1 | Item1Var1 | Black sunglasses | Black sunglasses for children | 2018-08-01T10:45:00.000Z
Item1 | Item1Var2 | Brown sunglasses | Brown sunglasses for adults |
Item2 | | Glasses cleaning cloth | | 2019-09-20T18:00:00.000Z
Item3 | Item3Var1 | | |
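
The inheritance guideline above (a variant with a missing attribute takes it from its item master) can be sketched as follows. This is a hedged illustration only: `Row` and `effective_row` are hypothetical names, the column assignment of the sample rows is one reading of the sample data, and applying inheritance to every attribute before default values is an assumption.

```python
from typing import NamedTuple, Optional

class Row(NamedTuple):
    item_id: str
    variant_id: str    # empty string for an item master or standalone item
    title: str
    description: str
    release_date: str  # kept as text; empty string means "not provided"

ROWS = [
    Row("Item1", "", "", "", "2018-05-15T13:30:00.000Z"),
    Row("Item1", "Item1Var1", "Black sunglasses",
        "Black sunglasses for children", "2018-08-01T10:45:00.000Z"),
    Row("Item1", "Item1Var2", "Brown sunglasses",
        "Brown sunglasses for adults", ""),
    Row("Item2", "", "Glasses cleaning cloth", "", "2019-09-20T18:00:00.000Z"),
    Row("Item3", "Item3Var1", "", "", ""),
]

def effective_row(rows, item_id: str, variant_id: str) -> Optional[Row]:
    """Fill a variant's empty attributes from its item master, if one exists."""
    by_key = {(r.item_id, r.variant_id): r for r in rows}
    variant = by_key.get((item_id, variant_id))
    if variant is None:
        return None
    master = by_key.get((item_id, ""))  # same ItemId, empty ItemVariantId
    if master is None:
        return variant                  # no master row: nothing to inherit
    return Row(
        item_id,
        variant_id,
        variant.title or master.title,
        variant.description or master.description,
        variant.release_date or master.release_date,
    )

# Item1Var2 has no release date of its own, so it takes the master's.
print(effective_row(ROWS, "Item1", "Item1Var2").release_date)
```

In this reading, Item3Var1 has no item master row, so it keeps its empty attributes and would fall back to the default values.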


Item categories

Data entity name: Reco_ItemCategories

Description: All item categories

Attributes:

Name | Data type | Mandatory | Default value | Invalid value behavior | Comments
ItemId | String(16) | Yes | | Drop entry | See Required data entities per recommendations scenario for item ID.
Category | String(64) | Yes | | Trim value | Length limited to 64 characters.

Guidelines:

  • Each ItemId can have multiple categories, meaning it can appear in multiple entries in the data.

  • If your data is constructed using category trees, you need to supply the full set of categories (flattened) for each item.

Sample data:

Headers appear for convenience only and shouldn't be part of the actual data.

ItemId | Category
Item1 | Category1
Item1 | Category1_subCategoryX
Item1 | Category1_subCategoryY
Item2 | Category1_subCategoryX
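
If your categories live in a tree, the flattening described in the guidelines might look like the sketch below. The function name `flatten_categories`, the input shape (a list of root-to-leaf paths per item), and the underscore-joined naming are assumptions modeled on the sample data, not a required format.

```python
def flatten_categories(item_paths):
    """Turn root-to-leaf category paths into one (ItemId, Category) row per level.

    item_paths maps an item ID to a list of paths, where each path is a list
    of category names ordered from the root to the leaf.
    """
    rows = set()  # a set removes duplicates when paths share ancestors
    for item_id, paths in item_paths.items():
        for path in paths:
            for depth in range(1, len(path) + 1):
                rows.add((item_id, "_".join(path[:depth])))
    return sorted(rows)

tree = {
    "Item1": [
        ["Category1", "subCategoryX"],
        ["Category1", "subCategoryY"],
    ],
}
for item_id, category in flatten_categories(tree):
    print(item_id, category)
# Item1 Category1
# Item1 Category1_subCategoryX
# Item1 Category1_subCategoryY
```

Each ancestor category gets its own row, so an item can be matched by a parent category as well as by its leaf category.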


Item and variant images

Data entity name: Reco_ItemAndVariantImages

Description: All item and item variant images

Attributes:

Name | Data type | Mandatory | Default value | Invalid value behavior | Comments
ItemId | String(16) | Yes | | Drop entry | See Required data entities per recommendations scenario for item ID.
ItemVariantId | String(16) | No | | Drop entry | See Required data entities per recommendations scenario for item variant ID.
ImageFullUrl | String(2048) | Yes | | Drop entry | Must be an absolute URL. The URL should be properly encoded (using percent-encoding). Length limited to 2048 characters.
IsPrimaryImage | Bool | Yes | | See guidelines | See Required data entities per recommendations scenario for Boolean values.

Guidelines:

  • You must explicitly assign images to an ItemId and to each relevant ItemVariantId. Images assigned to an item aren't automatically assigned to all item variants and vice-versa. Images assigned to an item variant aren't automatically assigned to the item master of that variant.

  • If more than one primary image is specified for the same <ItemId, ItemVariantId> combination, only one of these images will be used for the visual recommendations inference step and the others are used only when training the entire visual model.

  • For any image that Intelligent Recommendations failed to access, the image URL is ignored and not used for the recommendation model.

  • If the IsPrimaryImage value is invalid, a value of false will be used (that is, a nonprimary image).

  • If only nonprimary images were specified for an item or item variant, Intelligent Recommendations uses one of the specified images as a primary image to still provide visual recommendations for that item or item variant.

  • There are two types of supported URLs:

    • Publicly available HTTPS URLs: Don't require an Authorization header. This type doesn't include URLs of Azure blobs that are publicly/anonymously available, which aren't supported.
    • Azure blob storage URLs that require authentication: Aren't publicly/anonymously available. Permissions for reading the image blobs should be granted to Intelligent Recommendations, as explained in Deploy Intelligent Recommendations. Blob URLs must start with the prefix: https://<StorageAccountName>.blob.core.windows.net/.
  • The maximum supported size for a single image is 512 KB. Any image larger than 512 KB will be ignored by the system.

  • The ContentType of the image must be an image content type (it should start with image). This requirement applies to all images, both those available via HTTPS and image blobs (via the blob ContentType property).

Sample data:

Headers appear for convenience only and shouldn't be part of the actual data.

ItemId | ItemVariantId | ImageFullUrl | IsPrimaryImage
Item1 | | https://my.server.org/images/Item1_primary.jpg | True
Item1 | | https://my.server.org/images/Item1_secondary.jpg | False
Item1 | Item1Var1 | https://my.server.org/images/Item1Var1.jpg | True
Item2 | | https://my.server.org/images/Item2.jpg | True
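
Before uploading image rows, you might run a local pre-check against the constraints listed in the guidelines. The sketch below is an assumption-laden helper, not part of Intelligent Recommendations: `check_image` is a hypothetical function, the image size and content type are passed in rather than fetched, and 512 KB is taken to mean 512 × 1024 bytes.

```python
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2048
MAX_IMAGE_BYTES = 512 * 1024  # assumption: 1 KB = 1024 bytes

def check_image(url: str, size_bytes: int, content_type: str) -> list:
    """Return a list of problems found for one image; empty means none found."""
    problems = []
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        problems.append("URL is not absolute")
    elif parts.scheme.lower() != "https":
        problems.append("URL is not HTTPS")
    if len(url) > MAX_URL_LENGTH:
        problems.append("URL is longer than 2048 characters")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("image is larger than 512 KB")
    if not content_type.lower().startswith("image"):
        problems.append("content type is not an image type")
    return problems

print(check_image("https://my.server.org/images/Item1_primary.jpg",
                  100_000, "image/jpeg"))  # []
print(check_image("images/Item1.jpg", 100, "image/png"))
# ['URL is not absolute']
```

The check doesn't verify that the service can actually read the image, so a row that passes here can still be ignored if the URL is inaccessible.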


Item and variant filters

Data entity name: Reco_ItemAndVariantFilters

Description: Item and item variant properties used for runtime results filtering

Attributes:

Name | Data type | Mandatory | Default value | Invalid value behavior | Comments
ItemId | String(16) | Yes | | Drop entry | See Required data entities per recommendations scenario for item ID.
ItemVariantId | String(16) | No | | Drop entry | See Required data entities per recommendations scenario for item variant ID.
FilterName | String(64) | Yes | | Trim value |
FilterValue | String(64) | Yes | | Trim value | Length limited to 64 characters.
FilterType | String | Yes | | Drop entry | Possible values include: Textual, Numeric.

Guidelines:

  • Items and item variants have a parent-child relationship, which means that item variants inherit the filters of their item master. For example, if the “Color” filter was declared for a certain ItemId, all item variants of the same ItemId get the same “Color” filter value, unless a different “Color” value was specified for the item variant.

  • Textual filter types support the "equals" filtering operation. For example, API requests can filter items with "Color"="Blue".

  • Numeric filter types support "range" filtering operations. For example, API requests can filter items with "Size" > 40.

  • You can assign multiple filter values to the same filter. For example, for the "Color" filter, you can provide multiple values, like "Green" and "Blue". In this example, the relevant item has two values for the "Color" filter and will be returned when you filter for either "Green" items or "Blue" items. To assign multiple values to the same filter, add an entry for each filter value you want to assign, using the same FilterName and FilterType values.

  • For each FilterName, an item variant can either inherit its parent filter values or override them. Merging the two isn't supported. By default, if the variant has no values assigned to a filter, it inherits the parent item filter values. If at least one filter value is assigned to a filter for an item variant, then override mode is switched on and only the variant filter values are effective (for the specific filter only). This behavior means that to achieve a "merge" behavior, the item variant must repeat its parent filter values. For example, an item supports two colors, Blue and Green. If a variant supports another color, Red, then the variant must list all three colors assigned to the variant ID: Blue, Green, and Red. In this example, the item variant has overridden the values for the "Color" filter, but it can still inherit the values for other filters from its parent item.

  • Entries with unsupported filter types will be ignored.

  • You can provide up to 20 different FilterName values.

  • Providing multiple entries with the same FilterName but a different FilterType will fail the Intelligent Recommendations data ingestion process.

  • Items or item variants can have no filters specified. If you specify any filter in the API request, the items or item variants without the specified filter will be filtered out.

Sample data:

Headers appear for convenience only and shouldn't be part of the actual data.

ItemId | ItemVariantId | FilterName | FilterValue | FilterType
Item1 | | Color | Red | Textual
Item1 | Item1Var1 | Color | Burgundy | Textual
Item1 | Item1Var2 | Style | Rectangular | Textual
Item2 | | Size | 38 | Numeric
Item2 | | Color | Blue | Textual
Item2 | | Color | Green | Textual
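
The inherit-or-override rule from the guidelines can be illustrated with the sample rows. This is only a sketch of the described behavior under stated assumptions: `effective_filters` and `matches_textual` are hypothetical helpers, and FilterType validation is left out.

```python
from collections import defaultdict

# (ItemId, ItemVariantId, FilterName, FilterValue, FilterType)
ROWS = [
    ("Item1", "", "Color", "Red", "Textual"),
    ("Item1", "Item1Var1", "Color", "Burgundy", "Textual"),
    ("Item1", "Item1Var2", "Style", "Rectangular", "Textual"),
    ("Item2", "", "Size", "38", "Numeric"),
    ("Item2", "", "Color", "Blue", "Textual"),
    ("Item2", "", "Color", "Green", "Textual"),
]

def effective_filters(rows, item_id, variant_id=""):
    """Resolve filter values per FilterName: variant values override, never merge."""
    master = defaultdict(set)
    variant = defaultdict(set)
    for r_item, r_variant, name, value, _filter_type in rows:
        if r_item != item_id:
            continue
        if r_variant == "":
            master[name].add(value)
        elif r_variant == variant_id:
            variant[name].add(value)
    resolved = {name: set(values) for name, values in master.items()}
    # Any filter the variant defines replaces the inherited values for that filter.
    resolved.update({name: set(values) for name, values in variant.items()})
    return resolved

def matches_textual(filters, name, value):
    """'Equals' filtering: entries without the filter are filtered out."""
    return value in filters.get(name, set())

print(effective_filters(ROWS, "Item1", "Item1Var1"))  # {'Color': {'Burgundy'}}
print(matches_textual(effective_filters(ROWS, "Item2"), "Color", "Green"))  # True
```

Item1Var2 defines only Style, so it still inherits Color=Red from Item1, while Item1Var1 replaces Red with Burgundy.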


Item and variant availabilities

Data entity name: Reco_ItemAndVariantAvailabilities

Description: All item and item variant availabilities

Attributes:

Name | Data type | Mandatory | Default value | Invalid value behavior | Comments
ItemId | String(16) | Yes | | Drop entry | See Required data entities per recommendations scenario for item ID.
ItemVariantId | String(16) | No | | Drop entry | See Required data entities per recommendations scenario for item variant ID.
StartDate | DateTime | No | 0001-01-01T00:00:00.000Z | See guidelines | See Required data entities per recommendations scenario for DateTime values.
EndDate | DateTime | No | 9999-12-31T23:59:59.999Z | See guidelines | See Required data entities per recommendations scenario for DateTime values.
Double Attribute | Double | No | | | A double attribute that can be used according to the business' needs and doesn't affect the modeling process.
Channel | String(64) | No | 0 | Trim value | Length limited to 64 characters.
Catalog | String(64) | No | 0 | Trim value | Length limited to 64 characters.

Guidelines:

  • Reminder: availabilities tell the system what items or item variants are considered candidates for recommendations results.

  • The availability of an item variant is the union of availabilities of its item master with the availability of the item variant itself. Even item variants that have no entries inherit their item master availabilities.

  • An item that is missing from this data entity will be considered as always available in the default channel and catalog. More specifically, Intelligent Recommendations behaves exactly as if that item appears in the data with default values for all attributes.

  • ItemIds have a one-to-many relationship with ItemVariantIds. While an ItemId isn't required to have an ItemVariantId, it's possible that more than one ItemVariantId can be mapped to a single ItemId. For example, you can add an entry for a specific ItemId and ItemVariantId combination without also explicitly adding another entry for the ItemId (and an empty ItemVariantId). When determining whether item variants have valid availabilities, only the specified item variants are considered as available (at the specified time intervals per each variant).

  • A catalog is relevant only in the context of a channel (catalogs are a subset of a channel). For example, catalog=MySale in channel=Europe is a different catalog than catalog=MySale in channel=Asia.

  • If your dataset contains multiple channels and catalogs, you need to add an entry for each relevant channel and catalog combination for each relevant item and item variant.

  • Availability dates are relevant only for the specific channel and catalog specified. If you want to specify the same availability dates for different channels and catalogs, you need to explicitly add an entry for each channel and catalog.

  • If there's an invalid value for either of the attributes StartDate or EndDate, the entire entry is modified to represent an unavailable item. Both StartDate and EndDate values are overridden with DateTime values that are in the past.

  • The 'Double Attribute' can be left empty.

  • Don't use "0" as a value for "Channel". This value is reserved for the system. Using "0" will result in a processing error.

Sample data:

Headers appear for convenience only and shouldn't be part of the actual data.

ItemId | ItemVariantId | StartDate | EndDate | Double Attribute | Channel | Catalog
Item1 | | 2020-08-20T10:00:00.000Z | | | |
Item1 | Item1Var1 | 2020-08-01T12:00:00.000Z | | | |
Item2 | | 2020-04-01T10:00:00.000Z | 2020-04-15T23:59:59.999Z | 15.0 | |
Item2 | | 2020-04-01T10:00:00.000Z | | 9.76 | |
Item3 | | 2020-05-01T12:00:00.000Z | | | Europe | MySale
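
The availability rules in the guidelines (default dates, variant availability as a union with the item master, items missing from the data, and invalid dates) can be sketched as below. Treat it as an illustration under assumptions rather than the service's logic: `is_available` is a hypothetical function, the rows are a subset of the sample data with the Double Attribute column omitted, the date columns are read as StartDate then EndDate, and only entries whose channel and catalog equal the requested ones are matched.

```python
from datetime import datetime, timezone

DEFAULT_ID = "0"  # default channel and catalog ID
MIN_START = datetime(1, 1, 1, tzinfo=timezone.utc)
MAX_END = datetime(9999, 12, 31, 23, 59, 59, 999000, tzinfo=timezone.utc)

# (ItemId, ItemVariantId, StartDate, EndDate, Channel, Catalog)
ROWS = [
    ("Item1", "", "2020-08-20T10:00:00.000Z", "", "", ""),
    ("Item1", "Item1Var1", "2020-08-01T12:00:00.000Z", "", "", ""),
    ("Item2", "", "2020-04-01T10:00:00.000Z", "2020-04-15T23:59:59.999Z", "", ""),
    ("Item3", "", "2020-05-01T12:00:00.000Z", "", "Europe", "MySale"),
]

def parse(value, default):
    """Parse a DateTime value; an empty value falls back to the default."""
    if not value:
        return default
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)

def is_available(rows, item_id, variant_id, at,
                 channel=DEFAULT_ID, catalog=DEFAULT_ID):
    item_rows = [r for r in rows if r[0] == item_id]
    if not item_rows:
        # Missing items are always available in the default channel and catalog.
        return channel == DEFAULT_ID and catalog == DEFAULT_ID
    for _, r_variant, start, end, r_channel, r_catalog in item_rows:
        if r_variant not in ("", variant_id):
            continue  # union: item master rows plus the variant's own rows
        if (r_channel or DEFAULT_ID) != channel:
            continue
        if (r_catalog or DEFAULT_ID) != catalog:
            continue
        try:
            window = (parse(start, MIN_START), parse(end, MAX_END))
        except ValueError:
            continue  # an invalid date makes the entry unavailable
        if window[0] <= at <= window[1]:
            return True
    return False

def utc(*args):
    return datetime(*args, tzinfo=timezone.utc)

print(is_available(ROWS, "Item1", "", utc(2020, 8, 10)))           # False
print(is_available(ROWS, "Item1", "Item1Var1", utc(2020, 8, 10)))  # True
```

On 2020-08-10 the item master isn't available yet, but Item1Var1 is, because its own entry starts on 2020-08-01.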


See also

Data contract overview
Data entities mapping table
Interactions data entities
Reco configuration data entities
Opted-out users data entities
External lists data entities
Recommendations enrichment data entities
Image to item mapping data entities
Intelligent Recommendations API
Quick start guide: Set up and run Intelligent Recommendations with sample data